A Parallel Approach for Sentiment Analysis on Social Networks Using Spark

INTELLIGENT AUTOMATION AND SOFT COMPUTING (2023)

Journal

INTELLIGENT AUTOMATION AND SOFT COMPUTING

Volume 35, Issue 2, Pages 1831-1842

Publisher

TECH SCIENCE PRESS

DOI: 10.32604/iasc.2023.029036

Keywords

Social networks; sentiment analysis; big data; Spark; tweets; classification

Automated Summary

Social media has become a vital platform for public opinion, and efficient sentiment analysis methods are needed to handle large datasets. This research proposes a scalable system using Apache Spark and a Naive Bayes training technique for sentiment analysis on Twitter, achieving significant improvements in processing speed and cost-effectiveness.

Abstract

The public is increasingly using social media platforms such as Twitter and Facebook to express views on a variety of topics. As a result, social media has emerged as the largest and most effective open source of public opinion. Single-node computational methods are inefficient for sentiment analysis on such large datasets. Supercomputers and parallel or distributed processing are the two options for dealing with such large amounts of data. However, supercomputers are expensive, and most parallel programming frameworks, such as MPI (Message Passing Interface), are difficult to use and scale. Using the Apache Spark parallel model, this work presents a scalable system for sentiment analysis on Twitter. A Spark-based Naive Bayes training technique is proposed for this purpose; unlike prior research, this algorithm does not need any disk access. Millions of tweets have been classified using the trained model. Experiments with clusters of various sizes show that the proposed strategy is highly scalable and cost-effective for larger datasets. It is nearly 12 times faster than the MapReduce-based model and nearly 21 times faster than the Naive Bayes classifier in Apache Mahout. To evaluate the framework's scalability, we gathered a large training corpus from Twitter. The accuracy of the classifier trained with this new dataset was more than 80%.
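The abstract does not show the training procedure itself, so the following is only a minimal sketch of how Naive Bayes training can be phrased as a per-partition counting step followed by an in-memory merge, which is the general shape that Spark-style parallelism relies on. It is plain Python standing in for a cluster, not the authors' implementation. The whitespace tokenizer, the add-one (Laplace) smoothing, the toy tweets, and every function name here are assumptions made for illustration.

```python
import math
from collections import Counter
from functools import reduce


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not given.
    return text.lower().split()


def count_partition(rows):
    """Map step: count labels and (label, word) pairs within one partition."""
    label_counts, word_counts = Counter(), Counter()
    for label, text in rows:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[(label, word)] += 1
    return label_counts, word_counts


def merge_counts(a, b):
    """Reduce step: merge two partial results in memory, with no intermediate files."""
    return a[0] + b[0], a[1] + b[1]


def train(partitions):
    """Combine per-partition counts into log-priors and word statistics."""
    label_counts, word_counts = reduce(merge_counts, map(count_partition, partitions))
    total_docs = sum(label_counts.values())
    words_per_label = Counter()
    for (label, _), n in word_counts.items():
        words_per_label[label] += n
    return {
        "priors": {l: math.log(n / total_docs) for l, n in label_counts.items()},
        "word_counts": word_counts,
        "words_per_label": words_per_label,
        "vocab_size": len({w for (_, w) in word_counts}),
    }


def classify(model, text):
    """Return the label with the highest smoothed log-probability."""
    def score(label):
        # Assumption: add-one (Laplace) smoothing over the training vocabulary.
        denom = model["words_per_label"][label] + model["vocab_size"]
        s = model["priors"][label]
        for word in tokenize(text):
            s += math.log((model["word_counts"][(label, word)] + 1) / denom)
        return s
    return max(model["priors"], key=score)


# Hypothetical toy data: each inner list plays the role of one data partition.
partitions = [
    [("pos", "love this phone"), ("neg", "hate the battery")],
    [("pos", "great camera love it"), ("neg", "terrible screen hate it")],
]
model = train(partitions)
print(classify(model, "love the camera"))
print(classify(model, "hate the screen"))
```

Because the merge step is associative and commutative, the partial counts can be combined in any order, which is what lets a framework such as Spark aggregate them across workers without staging intermediate results on disk.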