4.7 Review

Building Fake Review Detection Model Based on Sentiment Intensity and PU Learning

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2023.3234427

Keywords

Training; Dictionaries; Blogs; Data models; Sentiment analysis; Predictive models; Feature extraction; Fake reviews; positive-unlabeled (PU) learning; semi-supervised learning; sentiment analysis


This article proposes a fake review detection model based on sentiment intensity and PU learning (SIPUL), which can continuously learn and predict from constantly arriving streaming data. The model divides the reviews into subsets according to sentiment intensity and extracts initial positive and negative samples using a marking mechanism. A semi-supervised positive-unlabeled (PU) learning detector is built based on the initial samples to detect fake reviews iteratively. The model effectively detects fake reviews, especially deceptive ones.
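The split by sentiment intensity described above can be illustrated with a small sketch. It is not the authors' implementation: the lexicon-based scoring, the names `sentiment_intensity` and `split_by_intensity`, the toy lexicon, and the `threshold` value of 0.3 are all assumptions made for illustration.

```python
def sentiment_intensity(tokens, lexicon):
    """Mean absolute polarity over all tokens (hypothetical scoring rule).

    Words missing from the lexicon contribute 0.
    """
    if not tokens:
        return 0.0
    return sum(abs(lexicon.get(t, 0.0)) for t in tokens) / len(tokens)


def split_by_intensity(reviews, lexicon, threshold=0.3):
    """Partition reviews into (strong, weak) sentiment sets.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    strong, weak = [], []
    for text in reviews:
        tokens = text.lower().split()
        if sentiment_intensity(tokens, lexicon) >= threshold:
            strong.append(text)
        else:
            weak.append(text)
    return strong, weak


# Toy polarity lexicon (assumed); real lexicons are far larger.
toy_lexicon = {"amazing": 0.9, "terrible": -0.9, "fine": 0.2}
strong_set, weak_set = split_by_intensity(
    ["Amazing amazing product", "it is fine"], toy_lexicon
)
```

Using the absolute polarity means that strongly positive and strongly negative reviews both land in the strong sentiment set, which matches the idea of intensity rather than direction.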
Fake review detection must cope with a huge volume of streaming data that grows without bound and changes dynamically. However, existing fake review detection methods mainly target limited, static review data. In addition, deceptive fake reviews have long been difficult to detect because their characteristics are hidden and diverse. To address these problems, this article proposes a fake review detection model based on sentiment intensity and PU learning (SIPUL), which continuously learns the prediction model from constantly arriving streaming data. First, when the streaming data arrive, sentiment intensity is used to divide the reviews into different subsets (i.e., a strong sentiment set and a weak sentiment set). Then, the initial positive and negative samples are extracted from the subset using the selected completely at random (SCAR) labeling mechanism and the Spy technique. Next, a semi-supervised positive-unlabeled (PU) learning detector is built from the initial samples to detect fake reviews in the data stream iteratively. According to the detection results, the initial sample data and the PU learning detector are continuously updated. Finally, old data are continually deleted according to the historical record points, which keeps the training sample data at a manageable size and helps prevent overfitting. Experimental results show that the model can effectively detect fake reviews, especially deceptive reviews.
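The Spy step mentioned in the abstract can be sketched as follows. This is a generic version of the Spy technique from the PU learning literature, not the SIPUL code: a few labeled positives are hidden in the unlabeled set as "spies", a classifier is trained with the unlabeled set treated as negative, and unlabeled items scoring below the spies are taken as reliable negatives. The tiny naive Bayes classifier, the function names, and the `spy_ratio` and `noise` parameters are assumptions for illustration.

```python
import math
import random
from collections import Counter


def train_nb(docs, labels):
    """Fit a two-class multinomial naive Bayes model on token lists."""
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter(labels)
    for tokens, y in zip(docs, labels):
        counts[y].update(tokens)
    vocab = set(counts[0]) | set(counts[1])
    return counts, n_docs, vocab


def prob_positive(model, tokens):
    """Posterior P(y=1 | tokens) with add-one smoothing."""
    counts, n_docs, vocab = model
    total_docs = n_docs[0] + n_docs[1]
    log_score = {}
    for y in (0, 1):
        total_tokens = sum(counts[y].values())
        s = math.log(n_docs[y] / total_docs)
        for t in tokens:
            if t in vocab:  # ignore words never seen in training
                s += math.log((counts[y][t] + 1) / (total_tokens + len(vocab)))
        log_score[y] = s
    m = max(log_score.values())  # subtract the max for numerical stability
    e0 = math.exp(log_score[0] - m)
    e1 = math.exp(log_score[1] - m)
    return e1 / (e0 + e1)


def spy_reliable_negatives(positive, unlabeled, spy_ratio=0.15, noise=0.0, seed=0):
    """Return unlabeled docs that score below the spy-derived threshold."""
    if len(positive) < 2:
        raise ValueError("need at least two positive documents")
    rng = random.Random(seed)
    n_spies = min(len(positive) - 1, max(1, int(len(positive) * spy_ratio)))
    spy_idx = set(rng.sample(range(len(positive)), n_spies))
    spies = [d for i, d in enumerate(positive) if i in spy_idx]
    kept = [d for i, d in enumerate(positive) if i not in spy_idx]
    # Spies are mixed into the unlabeled set, which is treated as negative.
    docs = kept + unlabeled + spies
    labels = [1] * len(kept) + [0] * (len(unlabeled) + len(spies))
    model = train_nb(docs, labels)
    spy_scores = sorted(prob_positive(model, d) for d in spies)
    # noise = fraction of spies allowed to fall below the threshold.
    cut = min(len(spy_scores) - 1, int(noise * len(spy_scores)))
    threshold = spy_scores[cut]
    return [d for d in unlabeled if prob_positive(model, d) < threshold]
```

The reliable negatives returned here, together with the labeled positives, would form the initial sample on which a PU learning detector is trained; how SIPUL combines this with the SCAR assumption and the sentiment subsets is described in the paper itself.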
