Article

A real-time deep-learning approach for filtering Arabic low-quality content and accounts on Twitter

INFORMATION SYSTEMS (2021)

Journal

INFORMATION SYSTEMS

Volume 99, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.is.2021.101740

Keywords

Low-quality content in social networks; Spam accounts; Real-time detection system; Deep learning techniques

Category

Computer Science, Information Systems

Funding

Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia [DF-777-165-1441]

Summary

Social networks generate vast amounts of data that are used for research and business, but harmful activities on social media reduce user satisfaction and pose challenges for systems built on social media data. This paper presents a real-time classification approach for identifying low-quality tweets written in Arabic and proposes a lightweight model for detecting spamming Twitter accounts in real time.

Abstract

Social networks have generated immense amounts of data that have been successfully utilized for research and business purposes. The approachability and immediacy of social media have also allowed ill-intentioned users to perform several harmful activities, including spamming, promoting, and phishing. These activities generate massive amounts of low-quality content that is often duplicated, automated, inappropriate, or irrelevant, which in turn affects users' satisfaction and poses a significant challenge for other social media-based systems. Several real-time systems have been developed to tackle this problem, each focusing on filtering a specific kind of low-quality content. In this paper, we present a fine-grained real-time classification approach to identify several types of low-quality tweets (i.e., phishing, promoting, and spam tweets) written in Arabic. The system automatically extracts textual features using deep learning techniques without relying on hand-crafted features, which are often time-consuming to obtain and are tailored to a single type of low-quality content. This paper also proposes a lightweight model that utilizes a subset of the textual features to identify spamming Twitter accounts in a real-time setting. The proposed methods are evaluated on a real-world dataset (40,000 tweets and 1,000 accounts), showing superior performance in both models with accuracy and F1-scores of 0.98. The proposed system classifies a tweet in less than five milliseconds and an account in less than a second. © 2021 Elsevier Ltd. All rights reserved.
