Extracting interrogative intents and concepts from geo-analytic questions
Keywords: Geo-analytic questions, Geographic questions, Information extraction, Grammatical parser, Concepts and intents, Geographic question-answering systems
Abstract. Understanding the syntactic and semantic structure of geographic questions is a necessary step towards true geographic question-answering (GeoQA) machines. Geographic question corpora provide the empirical basis for understanding the capabilities expected of GeoQA systems. Available corpora in English have mostly been drawn from generic Web search logs or limited user studies, which supports the focus of GeoQA systems on retrieving factoids: factual knowledge about particular places and everyday processes. Yet the majority of questions asked in the spatial sciences go beyond simple place facts and are informed by more complex analytical intents. In this paper, we introduce a new corpus of geo-analytic questions drawn from English textbooks and scientific articles. We analyse and compare this corpus with two general-purpose GeoQA corpora in terms of grammatical complexity and semantic concepts, using a new parsing method that allows us to differentiate and quantify patterns of a question’s intent.