Article

Extraction of temporal information from social media messages using the BERT model

EARTH SCIENCE INFORMATICS (2022)

Journal

EARTH SCIENCE INFORMATICS

Volume 15, Issue 1, Pages 573-584

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s12145-021-00756-6

Keywords

Temporal information extraction; Temporal expression recognition; BERT; Natural language processing

Categories

Computer Science, Interdisciplinary Applications; Geosciences, Multidisciplinary

Funding

National Natural Science Foundation of China [42050101, U1711267, 41871311, 41871305]
Open Research Project of The Hubei Key Laboratory of Intelligent Geo-Information Processing [KLIGIP-2021A01]
Major scientific and technological innovation projects in Shandong Province [2019JZZY020105]
China Postdoctoral Science Foundation [2021M702991]
Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) [CUG2106116]

Summary

Temporal information extraction from social media messages is crucial for geographical applications. A deep learning-based algorithm, BERT-BiLSTM-CRF, is proposed for automatically extracting temporal information. Experimental results show that the proposed method outperforms the current state-of-the-art models in extracting temporal information from Chinese social media texts.

Abstract

Temporal information extraction from social media messages is critically important to several geographical applications. Based on the characteristics of temporal descriptions in Chinese text, the time expression patterns formed by combinations of time units are summarized. A deep learning-based information extraction algorithm, BERT-BiLSTM-CRF, is proposed for automatically extracting temporal information from social media messages. It builds on the bidirectional long short-term memory-conditional random field (BiLSTM-CRF) model: the pretrained language model BERT (bidirectional encoder representations from transformers) is used to improve the generalization ability of the word vectors and to capture long-range contextual information, and the resulting word vectors are then fed into the BiLSTM-CRF model for further training. The proposed model is evaluated on a purpose-built corpus of manually annotated Chinese social media messages. Among the models compared, BERT-BiLSTM-CRF achieves the highest average F1-score, 85%. The experimental results show that the proposed method outperforms the current state-of-the-art models.
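To make the role of the CRF layer in such a pipeline more concrete, the following is a minimal sketch of Viterbi decoding over per-character tag scores, followed by extraction of the tagged time spans. It is not the authors' implementation. The BIO label set (`O`, `B-TIME`, `I-TIME`), the example sentence, the emission scores (stand-ins for what a BERT plus BiLSTM encoder might output) and the hand-set transition weights (which a real CRF would learn) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO label set for temporal expressions (an assumption).
TAGS = ["O", "B-TIME", "I-TIME"]


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float]) -> List[str]:
    """Return the highest-scoring tag sequence under emission + transition scores."""
    # best[i][tag]: best score of any path that ends in `tag` at position i.
    best = [{t: transitions.get(("START", t), 0.0) + emissions[0][t] for t in TAGS}]
    back: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        row, ptr = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            row[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            ptr[cur] = prev
        best.append(row)
        back.append(ptr)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: best[-1][t])
    path = [tag]
    for ptr in reversed(back):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]


def extract_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Join the characters of each B-TIME/I-TIME run into one time expression."""
    spans, start = [], None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        if tag != "I-TIME" and start is not None:
            spans.append("".join(tokens[start:i]))
            start = None
        if tag == "B-TIME":
            start = i
    return spans


# "It rained in Wuhan at 3 pm yesterday", split into characters.
tokens = list("昨天下午三点武汉下雨")

# Made-up emission scores in the order (O, B-TIME, I-TIME), one row per character.
raw = [(0.1, 2.0, 0.5), (0.2, 0.1, 2.0), (1.2, 0.3, 1.0), (0.1, 0.2, 2.0),
       (0.2, 0.9, 1.5), (0.3, 0.1, 2.0), (2.0, 0.1, 0.1), (2.0, 0.1, 0.1),
       (1.5, 0.2, 0.8), (2.0, 0.1, 0.1)]
emissions = [dict(zip(TAGS, r)) for r in raw]

# Hand-set transition scores: I-TIME may not start a sentence or follow O.
transitions = {("START", "I-TIME"): -5.0, ("O", "I-TIME"): -5.0,
               ("B-TIME", "I-TIME"): 0.5, ("I-TIME", "I-TIME"): 0.5}

tags = viterbi(emissions, transitions)
print(extract_spans(tokens, tags))  # ['昨天下午三点']
```

In this toy input, the third character (下) scores slightly higher as `O` than as `I-TIME` when taken alone, so per-character argmax would split the expression and produce an invalid `O` to `I-TIME` step. Scoring whole sequences with transition weights keeps the span 昨天下午三点 intact, which is the usual motivation for placing a CRF on top of a BiLSTM tagger.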
