<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">AGILE-GISS</journal-id>
<journal-title-group>
<journal-title>AGILE: GIScience Series</journal-title>
<abbrev-journal-title abbrev-type="publisher">AGILE-GISS</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">AGILE GIScience Ser.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2700-8150</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/agile-giss-7-46-2026</article-id>
<title-group>
<article-title>Should We Ask an LLM? Evaluating Toponym Disambiguation across Administrative Levels</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Welscher</surname>
<given-names>Franz</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Smith</surname>
<given-names>Paddy</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Leppämäki</surname>
<given-names>Tatu</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Ilyankou</surname>
<given-names>Ilya</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Department of Geoinformatics, University of Salzburg, Salzburg, Austria</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>School of Geography, University of Leeds, Leeds, UK</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, Finland</addr-line>
</aff>
<aff id="aff4">
<label>4</label>
<addr-line>SpaceTimeLab, Department of Civil, Environmental, and Geomatic Engineering, UCL, London, UK</addr-line>
</aff>
<pub-date pub-type="epub">
<day>10</day>
<month>06</month>
<year>2026</year>
</pub-date>
<volume>7</volume>
<elocation-id>46</elocation-id>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Franz Welscher et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://agile-giss.copernicus.org/articles/7/46/2026/agile-giss-7-46-2026.html">This article is available from https://agile-giss.copernicus.org/articles/7/46/2026/agile-giss-7-46-2026.html</self-uri>
<self-uri xlink:href="https://agile-giss.copernicus.org/articles/7/46/2026/agile-giss-7-46-2026.pdf">The full text article is available as a PDF file from https://agile-giss.copernicus.org/articles/7/46/2026/agile-giss-7-46-2026.pdf</self-uri>
<abstract>
<p>Toponym disambiguation, the task of determining which real-world location a place name refers to, is a critical step in geoparsing pipelines. Yet existing evaluations conflate disambiguation quality with the behavior of downstream geocoders by relying on distance-based metrics. We propose evaluating disambiguation as a standalone task: we prompt nine LLMs to predict administrative containment (ADM0 to ADM2) from textual context and score the predictions directly with precision, recall, and F1 against GADM-derived labels on the LGL corpus. Performance declines systematically as administrative granularity becomes finer, mid-sized models outperform the largest model tested, and recurring failure cases cluster around geopolitically complex regions. These findings suggest that feeding fine-grained LLM disambiguation outputs to geocoders may harm rather than help performance.</p>
</abstract>
<counts><page-count count="7"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>