Journal
INFORMATION SYSTEMS
Volume 99, Issue -, Pages -
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.is.2021.101740
Keywords
Low-quality content in social networks; Spam accounts; Real-time detection system; Deep learning techniques
Funding
- Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia [DF-777-165-1441]
Social networks generate vast amounts of data for research and business purposes, but harmful activities on social media reduce user satisfaction and pose challenges for other systems. This paper presents a real-time classification approach for identifying low-quality tweets and proposes a lightweight model for detecting spamming Twitter accounts in real time.
Social networks have generated immense amounts of data that have been successfully utilized for research and business purposes. The approachability and immediacy of social media have also allowed ill-intentioned users to perform several harmful activities, including spamming, promoting, and phishing. These activities generate massive amounts of low-quality content that is often duplicate, automated, inappropriate, or irrelevant, which in turn affects users' satisfaction and imposes a significant challenge for other social media-based systems. Several real-time systems were developed to tackle this problem by focusing on filtering a specific kind of low-quality content. In this paper, we present a fine-grained real-time classification approach to identify several types of low-quality tweets (i.e., phishing, promoting, and spam tweets) written in Arabic. The system automatically extracts textual features using deep learning techniques without relying on hand-crafted features, which are often time-consuming to obtain and are tailored for a single type of low-quality content. This paper also proposes a lightweight model that utilizes a subset of the textual features to identify spamming Twitter accounts in a real-time setting. The proposed methods are evaluated on a real-world dataset (40,000 tweets and 1,000 accounts), showing superior performance in both models with accuracy and F1-scores of 0.98. The proposed system classifies a tweet in less than five milliseconds and an account in less than a second. © 2021 Elsevier Ltd. All rights reserved.
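As an illustration of the two metrics reported above, the following minimal sketch computes accuracy and a macro-averaged F1-score for a multi-class tweet labelling task. It is not the authors' evaluation code. The abstract does not state which F1 averaging was used, so macro averaging is an assumption here, as are the "legitimate" class label and the toy labels, which are invented for the example and are not drawn from the paper's dataset.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, hypothetical and not taken from the paper.
y_true = ["spam", "phishing", "promoting", "legitimate", "spam", "legitimate"]
y_pred = ["spam", "phishing", "spam", "legitimate", "spam", "legitimate"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Macro averaging weights every class equally, so a rare class such as phishing influences the score as much as a common one. A weighted or micro average would give different numbers on imbalanced data.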