AGILE: GIScience Series Open-access proceedings of the Association of Geographic Information Laboratories in Europe
AGILE GIScience Ser., 3, 9, 2022
10 Jun 2022

Geoparsing: Solved or Biased? An Evaluation of Geographic Biases in Geoparsing

Zilong Liu1,2, Krzysztof Janowicz1,2,3, Ling Cai1,2, Rui Zhu1,2, Gengchen Mai1,2,4, and Meilin Shi1,2
  • 1STKO Lab, Department of Geography, University of California, Santa Barbara, USA
  • 2Center for Spatial Studies, University of California, Santa Barbara, USA
  • 3Department of Geography and Regional Research, University of Vienna, Austria
  • 4Department of Computer Science, Stanford University, USA

Keywords: geoparsing, spatially-explicit evaluation, regional variability, geographic bias, evaluation bias mitigation

Abstract. Geoparsing, the task of extracting toponyms from texts and associating them with geographic locations, has witnessed remarkable progress over the past years. However, despite its intrinsically geospatial nature, existing evaluations tend to focus on overall performance while paying little attention to its variation across geographic space. In this work, we attempt to answer the question of whether geoparsing is solved or biased by conducting a spatially-explicit evaluation, namely an evaluation of the regional variability in geoparsing performance. In particular, we analyze the spatial autocorrelation underlying this regional variability. By performing hot and cold spot detection over the results of several open-source geoparsers, we observe that none of them performs equally well across geographic space, and some are geographically biased towards some regions but against others. We also carry out a comparative experiment showing that state-of-the-art geoparsers developed with neural networks do not necessarily outperform the off-the-shelf tools across geographic space. To understand the implications of this observed regional variability, we evaluate geographic biases involved in geoparsing research centered around data contribution and usage, algorithm design, and performance evaluations. In particular, our spatially-explicit performance evaluation serves as an approach to evaluation bias mitigation in geoparsing. We conclude that previous performance evaluations published in the literature are overly optimistic, thus hiding the fact that geoparsing is far from solved, and that geoparsers require debiasing in addition to further considerations when being applied to (geospatial) downstream tasks.
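
As a rough illustration of what measuring spatial autocorrelation in regional performance can look like, the sketch below computes global Moran's I over per-region scores. This is not the procedure used in the paper: the abstract does not name a statistic, and Moran's I is swapped in here as one standard measure. The function name `morans_i`, the four-region adjacency, and the score values are all hypothetical.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights (1 for each listed neighbor pair).

    values:    per-region scores (hypothetical, e.g. regional accuracy)
    neighbors: dict mapping a region index to the indices adjacent to it
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    # Sum of all weights, and the weighted cross-products of deviations.
    weight_sum = sum(len(adj) for adj in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, adj in neighbors.items() for j in adj)
    variance_term = sum(d * d for d in dev)

    return (n / weight_sum) * (cross / variance_term)


# Hypothetical layout: four regions in a row, each adjacent to the next.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Illustrative scores only; these are not results from the paper.
clustered = morans_i([0.9, 0.8, 0.3, 0.2], chain)    # similar scores sit together
alternating = morans_i([0.9, 0.2, 0.8, 0.3], chain)  # high and low scores interleave
print(clustered, alternating)
```

A positive value indicates that regions with similar scores tend to be neighbors, which is the kind of pattern hot and cold spot detection then localizes; a negative value indicates that neighboring regions tend to differ.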
