AGILE-GISS

AGILE: GIScience Series

AGILE GIScience Ser.

2700-8150

Copernicus Publications

Göttingen, Germany

10.5194/agile-giss-7-46-2026

Should We Ask an LLM? Evaluating Toponym Disambiguation across Administrative Levels

Franz Welscher¹

Paddy Smith²

Tatu Leppämäki³

Ilya Ilyankou⁴

¹ Department of Geoinformatics, University of Salzburg, Salzburg, Austria

² School of Geography, University of Leeds, Leeds, UK

³ Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, Finland

⁴ SpaceTimeLab, Department of Civil, Environmental, and Geomatic Engineering, UCL, London, UK

10 June 2026

2026

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

This article is available from https://agile-giss.copernicus.org/articles/7/46/2026/agile-giss-7-46-2026.html

The full text article is available as a PDF file from https://agile-giss.copernicus.org/articles/7/46/2026/agile-giss-7-46-2026.pdf

Toponym disambiguation, determining which real-world location a place name refers to, is a critical step in geoparsing pipelines, yet existing evaluations mix disambiguation quality with the behavior of downstream geocoders through distance-based metrics. We propose evaluating disambiguation as a standalone task by prompting nine LLMs to predict administrative containment (ADM0 to ADM2) from textual context, scoring predictions directly with precision, recall, and F1 against GADM-derived labels on the LGL corpus. Performance declines systematically with administrative granularity, mid-sized models outperform the largest tested model, and recurring failure cases cluster around geopolitically complex regions. These findings suggest that feeding fine-grained LLM disambiguation outputs to geocoders may harm rather than help performance.
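The direct scoring described above can be illustrated with a minimal sketch. It assumes micro-averaged counts per administrative level and treats an abstention (no prediction) as a missed gold label; both conventions, along with the function name `score_level`, the `Score` container, and the toy labels, are hypothetical choices for illustration and are not taken from the paper's evaluation code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Score:
    precision: float
    recall: float
    f1: float


def score_level(gold: Sequence[str], pred: Sequence[Optional[str]]) -> Score:
    """Score predicted administrative units against gold labels for one level.

    Assumed convention (not from the paper): a prediction of None is an
    abstention. A wrong prediction counts as both a false positive and a
    false negative; an abstention counts only as a false negative.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = sum(1 for g, p in zip(gold, pred) if p is not None and p == g)
    fp = sum(1 for g, p in zip(gold, pred) if p is not None and p != g)
    fn = sum(1 for g, p in zip(gold, pred) if p is None or p != g)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Score(precision, recall, f1)


# Toy labels for four toponyms (invented for illustration only).
gold_adm0 = ["USA", "USA", "FRA", "GBR"]
pred_adm0 = ["USA", "USA", "FRA", None]  # one abstention

gold_adm2 = ["Travis", "Paris", "Lamar", "Leeds"]
pred_adm2 = ["Travis", "Lamar", None, "Leeds"]  # one wrong, one abstention

adm0 = score_level(gold_adm0, pred_adm0)
adm2 = score_level(gold_adm2, pred_adm2)
print(f"ADM0 P={adm0.precision:.2f} R={adm0.recall:.2f} F1={adm0.f1:.2f}")
print(f"ADM2 P={adm2.precision:.2f} R={adm2.recall:.2f} F1={adm2.f1:.2f}")
```

Because each level is scored on label agreement alone, the result does not depend on any geocoder or on coordinate distances, which is the separation the evaluation aims for.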