Article

Using unsupervised information to improve semi-supervised tweet sentiment classification

INFORMATION SCIENCES (2016)

Journal

INFORMATION SCIENCES

Volume 355, Pages 348-365

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.ins.2016.02.002

Keywords

Tweet sentiment analysis; Semi-supervised learning

Funding

Capes [DS-7253238/D]
CNPq [303348/2013-5]
FAPESP [2013/07375-0, 2010/20830-0]

Abstract

Supervised algorithms require a set of representative labeled data for building classification models. However, labeled data are usually difficult and expensive to obtain, which motivates the interest in semi-supervised learning. This type of learning uses both labeled and unlabeled data in the training process and is particularly useful in applications such as tweet sentiment analysis, where a large amount of unlabeled data is available. Semi-supervised learning for tweet sentiment analysis, although quite appealing, is relatively new. We propose a semi-supervised learning framework that combines unsupervised information, captured from a similarity matrix constructed from unlabeled data, with a classifier. Our motivation is that such a similarity matrix is a powerful knowledge-discovery tool that can help classify unlabeled tweet sets. Our framework uses the well-known Self-training algorithm to induce a better tweet sentiment classifier. Experimental results on real-world datasets demonstrate that the proposed framework can improve the accuracy of tweet sentiment analysis. © 2016 Elsevier Inc. All rights reserved.
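
The abstract does not say how the similarity matrix and the classifier are combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a toy centroid-based classifier, cosine similarity between tweet feature vectors, and a convex combination of the two sources of evidence; the names `self_train`, `alpha`, and `threshold` are hypothetical and do not come from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid_probs(X_lab, y_lab, x, classes):
    """Toy base classifier (an assumption): softmax over cosine similarity to class centroids."""
    scores = []
    for c in classes:
        rows = [xi for xi, yi in zip(X_lab, y_lab) if yi == c]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        scores.append(math.exp(cosine(x, centroid)))
    total = sum(scores)
    return [s / total for s in scores]

def self_train(X_lab, y_lab, X_unl, alpha=0.5, threshold=0.6, max_iter=10):
    """Self-training loop that mixes classifier output with similarity-based evidence.

    alpha weights the classifier against the neighbourhood estimate; threshold is
    the confidence needed to pseudo-label a tweet. Both are hypothetical knobs.
    Returns {index into X_unl: pseudo-label} for the tweets that were labeled.
    """
    classes = sorted(set(y_lab))
    X_lab, y_lab = list(X_lab), list(y_lab)
    pool = list(range(len(X_unl)))
    # Similarity matrix built from the unlabeled tweets only.
    S = [[cosine(a, b) for b in X_unl] for a in X_unl]
    labels = {}
    for _ in range(max_iter):
        if not pool:
            break
        # Classifier probabilities for every tweet still in the pool.
        P = {i: centroid_probs(X_lab, y_lab, X_unl[i], classes) for i in pool}
        chosen = []
        for i in pool:
            # Unsupervised evidence: similarity-weighted average of the
            # classifier probabilities of the other unlabeled tweets.
            weights = [(max(S[i][j], 0.0), j) for j in pool if j != i]
            wsum = sum(w for w, _ in weights)
            if wsum > 0:
                nb = [sum(w * P[j][k] for w, j in weights) / wsum
                      for k in range(len(classes))]
            else:
                nb = P[i]
            combined = [alpha * p + (1 - alpha) * q for p, q in zip(P[i], nb)]
            k = max(range(len(classes)), key=combined.__getitem__)
            if combined[k] >= threshold:
                chosen.append((i, classes[k]))
        if not chosen:
            break  # nothing is confident enough; stop early
        # Move confident tweets into the labeled set and retrain next round.
        for i, c in chosen:
            X_lab.append(X_unl[i])
            y_lab.append(c)
            labels[i] = c
            pool.remove(i)
    return labels
```

With two-dimensional toy vectors (say, counts of a positive and a negative word), calling `self_train([[1.0, 0.0], [0.0, 1.0]], ["pos", "neg"], X_unl)` pseudo-labels the unlabeled tweets that clearly lean one way and leaves an evenly mixed tweet such as `[1.0, 1.0]` unlabeled, since its combined confidence never reaches the threshold.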
