Version 1: Received: 20 October 2023 / Approved: 20 October 2023 / Online: 23 October 2023 (05:43:30 CEST)
How to cite:
Yin, D.; Yin, C.; Liu, W.; Wu, H.; Liu, K.; Wang, Y. Rapid Production Method of Massive Thematic Maps Based on Geospatial Knowledge Extraction. Preprints 2023, 2023101345. https://doi.org/10.20944/preprints202310.1345.v1
APA Style
Yin, D., Yin, C., Liu, W., Wu, H., Liu, K., & Wang, Y. (2023). Rapid Production Method of Massive Thematic Maps Based on Geospatial Knowledge Extraction. Preprints. https://doi.org/10.20944/preprints202310.1345.v1
Chicago/Turabian Style
Yin, D., Kexin Liu and Yanhui Wang. 2023 "Rapid Production Method of Massive Thematic Maps Based on Geospatial Knowledge Extraction" Preprints. https://doi.org/10.20944/preprints202310.1345.v1
Abstract
Geospatial knowledge contained in the large body of academic papers can support knowledge services such as location-based research hotspot analysis, spatio-temporal data aggregation, and recommendation of research results. However, this knowledge is often implicit and unstructured in the literature, which makes it difficult to access, mine, and use directly for the rapid production of massive thematic maps. In this paper, we take the area studied as an example of geospatial knowledge and describe its extraction method in detail. We propose an algorithm that integrates feature template matching with random forest classification to accurately identify research areas from the abstracts of academic papers and to produce thematic maps. First, geographical names are recognized step by step using the BiLSTM-CRF algorithm and an improved heuristic disambiguation method. Then, the area studied is extracted by applying a random forest classifier to the designed integrated feature recognition template, and a rapid thematic map is designed for the knowledge of area studied, topic, and literature. The experimental results show that recognition of the area studied reaches an accuracy of 97%, an F-value of 96%, and a recall of 96%, so extraction of the area studied from text is both accurate and efficient. Based on this geospatial knowledge, thematic maps can be produced quickly and expressed accurately.
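The abstract reports accuracy, recall, and F-value without stating how they were computed. The sketch below shows the standard definitions for a binary decision of the kind described, where each candidate place name is judged to be the area studied or not. The function name `evaluate`, the boolean label encoding, and the sample labels are illustrative assumptions, not the authors' evaluation protocol or data.

```python
def evaluate(predicted, actual):
    """Compute standard binary classification metrics.

    predicted, actual: equal-length lists of booleans, one entry per
    candidate place name (True = "this is the area studied").
    This encoding is an assumption made for illustration.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # correctly accepted
    fp = sum(1 for p, a in pairs if p and not a)      # wrongly accepted
    fn = sum(1 for p, a in pairs if not p and a)      # wrongly rejected
    tn = sum(1 for p, a in pairs if not p and not a)  # correctly rejected

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-value (F1): harmonic mean of precision and recall
    f_value = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_value": f_value}


# Made-up labels for five candidate place names (not data from the paper)
metrics = evaluate(
    predicted=[True, True, False, True, False],
    actual=[True, False, False, True, True],
)
```

Accuracy counts every correct decision, including correct rejections, while the F-value ignores true negatives. The two can therefore differ on the same test set, as the figures in the abstract do.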
Keywords
area studied; BLFR model; BI-LSTM-CRF; improved heuristic disambiguation method; feature template; random forest
Subject
Environmental and Earth Sciences, Geography
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.