4.6 Article

Sparse Self-Attention LSTM for Sentiment Lexicon Construction

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TASLP.2019.2933326

Keywords

Sentiment lexicon; sentiment analysis; sparsity; deep learning; text mining

Funding

  1. National Natural Science Foundation of China [61822601, 61773050, 61632004]
  2. Beijing Natural Science Foundation [Z180006]
  3. Beijing Municipal Science & Technology Commission [Z181100008918012]
  4. National Key Research and Development Program [2017YFC1703506]

A sentiment lexicon is a very important resource for opinion mining. Recently, many state-of-the-art works have employed deep learning techniques to construct sentiment lexicons. In general, they first learn sentiment-aware word embeddings and then use them as word features to construct sentiment lexicons. However, these methods do not consider how much each word contributes to distinguishing documents' sentiment polarities. Most words in a document do not contribute to understanding its semantics or sentiment. For example, in the tweet "It's a good day, but i can't feel it. I'm really unhappy.", the words 'unhappy', 'feel' and 'can't' are much more important than the words 'good' and 'day' in predicting the sentiment polarity of the tweet. Meanwhile, many words, such as 'the', 'in', 'it' and 'I'm', are uninformative. In this paper, we propose a novel sparse self-attention LSTM (SSALSTM) to efficiently capture these intuitions and then construct a large-scale sentiment lexicon for Twitter. In SSALSTM, we use a novel self-attention mechanism to capture the importance of each word in distinguishing documents' sentiment polarities. In addition, an L1 regularizer is applied to the attention weights, which enforces the sparsity property that most words in a document are semantically and sentimentally indistinguishable. Once we have learned an effective sentiment-aware word embedding, we train a classifier that uses the sentiment-aware word embeddings as features to predict the sentiment polarities of words. Extensive experiments on four publicly available datasets, SemEval 2013-2016, indicate that the sentiment lexicon generated by our proposed model achieves state-of-the-art performance on both supervised and unsupervised sentiment classification tasks.
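The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of attention-weighted pooling with an L1 penalty on the attention weights. Everything in it is an assumption made for illustration: the sigmoid gate (chosen because softmax-normalized weights always sum to 1, which would make an L1 penalty constant), the names `attention_weights`, `l1_penalty` and `total_loss`, the penalty strength `lam`, and the toy 2-dimensional vectors that stand in for LSTM hidden states. It is not the authors' SSALSTM implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_weights(hidden_states, query):
    """One unnormalized weight in (0, 1) per token: sigmoid(h . query).

    Assumption: a sigmoid gate rather than softmax, so the L1 norm of the
    weights is not fixed and the penalty can push weights toward zero.
    """
    return [sigmoid(dot(h, query)) for h in hidden_states]


def document_vector(hidden_states, weights):
    """Attention-weighted sum of the per-token hidden states."""
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]


def l1_penalty(weights, lam):
    """lam * ||weights||_1, the sparsity term added to the training loss."""
    return lam * sum(abs(w) for w in weights)


def total_loss(hidden_states, query, clf_weights, label, lam):
    """Binary cross-entropy on the document polarity plus the L1 term."""
    weights = attention_weights(hidden_states, query)
    doc = document_vector(hidden_states, weights)
    p = sigmoid(dot(clf_weights, doc))  # predicted P(positive)
    cross_entropy = -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return cross_entropy + l1_penalty(weights, lam)


# Hypothetical toy vectors standing in for LSTM outputs of three tokens.
tokens = ["good", "day", "unhappy"]
hidden_states = [[0.5, 0.1], [0.0, 0.1], [-2.0, 0.3]]
query = [-1.5, 0.0]        # hypothetical learned attention parameters
clf_weights = [1.0, 0.0]   # hypothetical learned classifier parameters

weights = attention_weights(hidden_states, query)
loss_plain = total_loss(hidden_states, query, clf_weights, label=0, lam=0.0)
loss_sparse = total_loss(hidden_states, query, clf_weights, label=0, lam=0.1)

for token, w in zip(tokens, weights):
    print(f"{token}: attention weight {w:.3f}")
```

With these toy numbers, 'unhappy' receives a larger weight than 'good'. The penalized loss exceeds the unpenalized one by exactly `lam` times the sum of the weights, and that extra term is what would push a trained model to keep only a few weights large.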
