<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">AGILE-GISS</journal-id>
<journal-title-group>
<journal-title>AGILE: GIScience Series</journal-title>
<abbrev-journal-title abbrev-type="publisher">AGILE-GISS</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">AGILE GIScience Ser.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2700-8150</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/agile-giss-7-14-2026</article-id>
<title-group>
<article-title>Investigating the Generalizability of Segment Anything Model for Large-Scale Geospatial Segmentation</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Mansour</surname>
<given-names>Wejdene</given-names>
<ext-link>https://orcid.org/0009-0008-4362-2092</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Walther</surname>
<given-names>Paul</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Li</surname>
<given-names>Hao</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Werner</surname>
<given-names>Martin</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Department of Aerospace and Geodesy, TUM School of Engineering and Design, Technical University of Munich, Germany</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Department of Geography, National University of Singapore, Singapore</addr-line>
</aff>
<pub-date pub-type="epub">
<day>10</day>
<month>06</month>
<year>2026</year>
</pub-date>
<volume>7</volume>
<elocation-id>14</elocation-id>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Wejdene Mansour et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri"  xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://agile-giss.copernicus.org/articles/7/14/2026/agile-giss-7-14-2026.html">This article is available from https://agile-giss.copernicus.org/articles/7/14/2026/agile-giss-7-14-2026.html</self-uri>
<self-uri xlink:href="https://agile-giss.copernicus.org/articles/7/14/2026/agile-giss-7-14-2026.pdf">The full text article is available as a PDF file from https://agile-giss.copernicus.org/articles/7/14/2026/agile-giss-7-14-2026.pdf</self-uri>
<abstract>
<p>Foundation Models (FMs) are promising approaches in multimodal artificial intelligence as they provide foundational task knowledge across computer vision, language understanding, and related domains. Despite their success, the extent to which FMs generalize to domain-specific tasks remains unclear, especially in Earth System Sciences (ESS). In this work, we investigate the geographical and task-level generalizability of the Segment Anything Model (SAM) and the vision&#x2013;language FMs CLIP and Grounding DINO across two distinct vision tasks: 1) building footprint segmentation from high-quality airborne images at 40 cm ground sampling distance (GSD) and 2) surface water segmentation from Sentinel-2 imagery at about 10 m GSD. We explore strategies to improve the zero-shot applicability of the general-purpose SAM by combining it with other pre-trained FMs for detection and classification, and we evaluate the performance gains achievable with minimal computational overhead through few-shot adapters on these datasets. Furthermore, we assess whether remote-sensing-specific training in RemoteCLIP and RemoteSAM leads to meaningful improvements over their general-purpose counterparts in large-scale geospatial segmentation. Overall, we conclude that domain-specific FMs can provide performance gains in certain settings, but are neither required nor always useful when compared with lightweight adaptation strategies and mixtures of different general models. This suggests that a more economical pathway might be to increase the remote sensing data used in the training of general FMs instead of training dedicated models specifically for ESS.</p>
</abstract>
<counts><page-count count="14"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>