Journal
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
Volume 34, Issue 8, Pages 6048-6056
Publisher
ELSEVIER
DOI: 10.1016/j.jksuci.2021.07.013
Keywords
Offensive language detection; Social media; Multilingual; Transfer learning; Text classification; Natural language processing
This study aims to tackle the problem of offensive communications on social media by using computational techniques and transfer learning models. The proposed approach, based on BERT and translation-based techniques, achieves high performance in terms of F1-score and accuracy for multilingual offensive language detection.
Offensive communications have invaded social media content. One of the most effective ways to cope with this problem is to use computational techniques to identify offensive content. Moreover, social media users come from linguistically diverse communities. This study tackles the Multilingual Offensive Language Detection (MOLD) task using transfer learning models and fine-tuning. We propose an effective approach based on Bidirectional Encoder Representations from Transformers (BERT), which has shown great potential in capturing the semantics and contextual information within texts. The proposed system consists of several stages: (1) preprocessing, (2) text representation using BERT models, and (3) classification into two categories: offensive and non-offensive. To handle multilingualism, we explore different techniques, such as the joint-multilingual and translation-based ones. The first consists of developing one classification system for different languages, and the second adds a translation phase that transforms all texts into one universal language and then classifies them. We conduct several experiments on a bilingual dataset extracted from the Semi-supervised Offensive Language Identification Dataset (SOLID). The experimental findings show that the translation-based method in conjunction with Arabic BERT (AraBERT) achieves over 93% and 91% in terms of F1-score and accuracy, respectively.
(c) 2021 The Authors. Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
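The difference between the two multilingual strategies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cleanup rules in `preprocess`, the function names, and the toy `toy_translate` and `toy_classifier` stand-ins are all assumptions made for illustration. In a real system the classifier would be a fine-tuned BERT model and the translator a machine translation service.

```python
import re
from typing import Callable, List

# Hypothetical type aliases: a classifier maps text to a label,
# a translator maps text to text in the pivot language.
Classifier = Callable[[str], str]
Translator = Callable[[str], str]


def preprocess(text: str) -> str:
    """Assumed cleanup: drop URLs and user mentions, collapse whitespace."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def joint_multilingual(texts: List[str], classifier: Classifier) -> List[str]:
    """One classifier receives texts in every language directly."""
    return [classifier(preprocess(t)) for t in texts]


def translation_based(
    texts: List[str], translate: Translator, classifier: Classifier
) -> List[str]:
    """Translate every text into one pivot language, then classify."""
    return [classifier(translate(preprocess(t))) for t in texts]


# Toy stand-ins (assumptions, not real models).
def toy_translate(text: str) -> str:
    lexicon = {"idiota": "idiot"}  # hypothetical one-word lexicon
    return " ".join(lexicon.get(w, w) for w in text.lower().split())


def toy_classifier(text: str) -> str:
    # Placeholder for a monolingual fine-tuned model.
    return "offensive" if "idiot" in text.lower().split() else "non-offensive"
```

With these stand-ins, `translation_based(["@bob eres idiota"], toy_translate, toy_classifier)` routes the text through the translator before classification, whereas `joint_multilingual` passes the cleaned text straight to whatever classifier it is given, so that classifier must itself cover all input languages.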