4.5 Article

A real-time deep-learning approach for filtering Arabic low-quality content and accounts on Twitter

Journal

INFORMATION SYSTEMS
Volume 99, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.is.2021.101740

Keywords

Low-quality content in social networks; Spam accounts; Real-time detection system; Deep learning techniques

Funding

  1. Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia [DF-777-165-1441]

Social networks generate vast amounts of data for research and business purposes, but harmful activities on social media can reduce user satisfaction and pose challenges for other systems. This paper presents a real-time classification approach for identifying low-quality tweets and proposes a lightweight model for detecting spamming Twitter accounts in real time.
Social networks have generated immense amounts of data that have been successfully utilized for research and business purposes. The approachability and immediacy of social media have also allowed ill-intentioned users to perform several harmful activities, including spamming, promoting, and phishing. These activities generate massive amounts of low-quality content that is often duplicated, automated, inappropriate, or irrelevant, which in turn affects users' satisfaction and poses a significant challenge for other social media-based systems. Several real-time systems have been developed to tackle this problem, each focusing on filtering a specific kind of low-quality content. In this paper, we present a fine-grained real-time classification approach to identify several types of low-quality tweets (i.e., phishing, promoting, and spam tweets) written in Arabic. The system automatically extracts textual features using deep learning techniques, without relying on hand-crafted features that are often time-consuming to obtain and are tailored to a single type of low-quality content. This paper also proposes a lightweight model that uses a subset of the textual features to identify spamming Twitter accounts in a real-time setting. The proposed methods are evaluated on a real-world dataset (40,000 tweets and 1,000 accounts), and both models achieve accuracy and F1-scores of 0.98. The proposed system classifies a tweet in less than five milliseconds and an account in less than a second. © 2021 Elsevier Ltd. All rights reserved.
