Article

Text visualization for geological hazard documents via text mining and natural language processing

Journal

EARTH SCIENCE INFORMATICS
Volume 15, Issue 1, Pages 439-454

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s12145-021-00732-0

Keywords

Geological disaster report; Text mining; Natural language processing; Text visualization analysis

Funding

  1. National Natural Science Foundation of China [42050101, U1711267, 41871311, 41871305]
  2. National Key Research and Development Program [2018YFB0505500, 2018YFB0505504]
  3. Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) [CUG2106116]

This research describes a workflow framework for automatically extracting key information from geological disaster reports using text mining and visualization techniques. The extracted information is transformed into a simple, intuitive form that managers and researchers can quickly navigate and understand in order to make informed decisions. The proposed approach uses an optimized term frequency-inverse document frequency algorithm, word clouds, co-occurrence network analysis, dependency grammar, and knowledge graphs for information analysis and visualization.
A growing number of geological hazard documents describing the mechanisms and occurrence processes of geological disasters contain unstructured geoscientific data that are not fully utilized. Text mining and visualization techniques offer a way to leverage these data: they extract valuable information from dense, abstract geological disaster reports, let readers focus quickly on the core information, and improve the efficiency of report usage. This research describes a workflow framework that automatically extracts key information and transforms it into a simple, intuitive form that managers and researchers can quickly navigate and understand in order to make more informed decisions. To automatically extract key information from text, an optimized term frequency-inverse document frequency (TF-IDF) algorithm is proposed to analyze text characteristics. The important information extracted from a case study document is presented as a word cloud. Co-occurrence network analysis is used to present key content from geological reports and to describe the correlations between words. Dependency grammar is used to extract triples from the report text, which are then visualized as knowledge graphs. The results show that text visualization analysis can identify the types and locations of geological disasters in reports, highlight key information from survey reports as an auxiliary public resource, and speed up analysis of the key contents of large numbers of geological disaster survey reports.
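
The abstract does not spell out how the optimized term frequency-inverse document frequency algorithm differs from the standard one, so the following is only a minimal sketch of the plain textbook weighting, tf(t, d) * log(N / df(t)), together with a simple document-level co-occurrence count of the kind a co-occurrence network could be built from. The function names `tf_idf` and `cooccurrence`, the whitespace tokenization, and the toy corpus are all assumptions made for illustration; none of them come from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document (plain TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def cooccurrence(docs):
    """Count how many documents contain each unordered pair of distinct terms."""
    pairs = Counter()
    for doc in docs:
        # Sorting makes each pair key canonical, e.g. ("rainfall", "slope").
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


# Hypothetical toy corpus, already tokenized; not data from the paper.
docs = [
    "landslide rainfall slope landslide".split(),
    "debris flow rainfall gully".split(),
    "collapse slope rock".split(),
]

scores = tf_idf(docs)
# Terms of the first document, highest weight first (word-cloud candidates).
top_terms = sorted(scores[0], key=scores[0].get, reverse=True)
# Edge weights for a co-occurrence network over the whole corpus.
pairs = cooccurrence(docs)
```

In this sketch, the highest-scoring terms per document would be the candidates for a word cloud, and the `pairs` counter gives edge weights between terms. Real reports would also need language-appropriate word segmentation and stop-word filtering, which are omitted here.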
