Article

Feature extraction using LR-PCA hybridization on twitter data and classification accuracy using machine learning algorithms

Publisher

SPRINGER
DOI: 10.1007/s10586-018-2158-3

Keywords

Social networks; Twitter; PCA; Logistic regression; Machine learning

Twitter, a social blogging site, has become enormously popular, and many organizations and members of the public use it to build their identity and reach. Unfortunately, Twitter faces serious challenges from spammers, who damage the site's reputation and lead genuine users to stop using it. Researchers have proposed many techniques to counter spammers, but whenever researchers find a new approach, spammers develop new techniques to evade it. Many algorithms have been proposed to detect spammers, and several feature extraction techniques have been developed to improve the detection rate. This paper focuses on feature extraction using a hybrid approach that combines logistic regression with dimensionality reduction by principal component analysis (PCA). Our dataset contains tweets from 17 million users, with 159 features. From these we extract a subset of features intended to increase classification accuracy in the later stages. We then classify the data using several machine learning techniques. The proposed work indicates that the detection rate can be increased by using the selected features for classification.
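
The abstract does not say how logistic regression and PCA are combined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes logistic regression is used first to keep the features with the largest absolute coefficients, and PCA then compresses the kept features before classification. The data is synthetic (159 features, as in the abstract, but only 2,000 rows). The random forest classifier, the "mean" selection threshold, and the 95% variance target are all illustrative assumptions. The sketch uses scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Twitter data: 159 features as in the abstract,
# but made-up values and labels (1 = spammer, 0 = genuine user).
X, y = make_classification(
    n_samples=2000,
    n_features=159,
    n_informative=20,
    n_redundant=30,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1 (assumption): fit logistic regression and keep only the features
# whose absolute coefficient is at least the mean absolute coefficient.
lr_selector = SelectFromModel(LogisticRegression(max_iter=1000), threshold="mean")

# Step 2 (assumption): PCA keeps as many components as are needed to
# explain 95% of the variance of the selected features.
pca = PCA(n_components=0.95)

# Step 3 (assumption): any classifier could follow; a random forest is used here.
pipeline = make_pipeline(
    StandardScaler(),  # put features on a common scale before LR and PCA
    lr_selector,
    pca,
    RandomForestClassifier(n_estimators=100, random_state=0),
)
pipeline.fit(X_train, y_train)

n_selected = int(pipeline.named_steps["selectfrommodel"].get_support().sum())
n_components = int(pipeline.named_steps["pca"].n_components_)
accuracy = pipeline.score(X_test, y_test)

print(f"features kept by logistic regression: {n_selected} of {X.shape[1]}")
print(f"principal components retained: {n_components}")
print(f"held-out accuracy: {accuracy:.3f}")
```

Placing every step in one pipeline means the scaler, the selector, and PCA are fitted on the training split only, so the held-out accuracy is not inflated by information from the test rows.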
