Comparative Evaluation of Keyphrase Extraction Tools for Semantic Analysis of Climate Change Scientific Reports and Ontology Enrichment
Keywords: Keyphrase Extraction, Ontology Enrichment, SWEET ontology, Climate Change, Geospatial Concepts
Abstract. Keyphrase extraction is the process of identifying important concepts and entities within unstructured information sources to facilitate ontology enrichment, semantic analysis, and information retrieval. In this paper, three keyphrase extraction tools are compared to evaluate their accuracy and effectiveness in extracting geospatial and climate change concepts from climate change reports: term frequency-inverse document frequency (TF-IDF), Amazon Comprehend, and YAKE. Climate change reports contain vital information for comprehending the complexity of climate change causes, impacts, and interconnections, and include a wealth of information on geospatial concepts, locations, and events. However, the diverse terminology they use complicates information extraction and organization. The highest-scoring keyphrases are then used to enrich and populate the SWEET ontology with concepts and instances related to climate change, together with meaningful relations between them, to support semantic representation and formalization of knowledge.
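To make the TF-IDF baseline concrete, the following is a minimal sketch of TF-IDF term scoring in plain Python. It is an illustration under stated assumptions, not the configuration evaluated in this paper: the function names (`tokenize`, `tfidf_scores`, `top_terms`), the single-word tokenization, the unsmoothed logarithmic IDF, and the toy documents are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(documents):
    """Return one {term: tf-idf score} dict per document.

    tf  = occurrences of the term in the document / document length
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Return the k highest-scoring terms per document (ties broken alphabetically)."""
    return [
        sorted(doc_scores, key=lambda term: (-doc_scores[term], term))[:k]
        for doc_scores in tfidf_scores(documents)
    ]


if __name__ == "__main__":
    # Toy documents invented for illustration only.
    docs = [
        "climate sea level",
        "climate drought",
        "climate sea ice",
    ]
    for doc, terms in zip(docs, top_terms(docs, k=2)):
        print(doc, "->", terms)
```

In this sketch a term that occurs in every document, such as "climate" in the toy corpus, receives a score of zero, which is why TF-IDF favours terms that are distinctive for a given report. Extracting multi-word keyphrases would additionally require candidate n-gram generation, which is omitted here.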