Article

Exploring and Inferring User-User Pseudo-Friendship for Sentiment Analysis with Heterogeneous Networks

Journal

STATISTICAL ANALYSIS AND DATA MINING
Volume 7, Issue 4, Pages 308-321

Publisher

WILEY
DOI: 10.1002/sam.11223

Keywords

heterogeneous information network; semi-supervised refining model; sentiment analysis; dissimilarity

Funding

  1. Directorate for Computer & Information Science & Engineering
  2. Division of Information & Intelligent Systems [Grant 0953149], Funding Source: National Science Foundation

Abstract

With the development of social media and social networks, user-generated content such as forum posts, blogs, and comments is not only becoming richer but also ubiquitously interconnected with many other objects and entities, forming a heterogeneous information network. Sentiment analysis on such data can no longer ignore the information network, since it carries rich and valuable information, explicit or implicit, some of which can be observed while the rest cannot. However, most existing methods rely heavily on observed user-user friendship or similarity between objects, and can only handle a subgraph associated with a single topic. None of them takes into account hidden and implicit dissimilarity, opposite opinions, or foe relationships. In this paper, we propose a novel information network-based framework that infers hidden similarity and dissimilarity between users by exploring similar and opposite opinions, so as to improve post-level and user-level sentiment classification at the same time. More specifically, we develop a new meta path-based measure for inferring pseudo-friendship as well as dissimilarity between users, and propose a semi-supervised refining model that encodes similarity and dissimilarity from both user-level and post-level relations. We extensively evaluate the proposed approach and compare it with several state-of-the-art techniques on two real-world forum datasets. Experimental results show that our proposed model, using only 10.5% labeled samples, achieves better performance than a traditional supervised model trained on 61.7% of the data samples. (C) 2014 Wiley Periodicals, Inc.
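To make the meta path idea concrete, below is a minimal sketch, not the authors' exact formulation, of a meta path-based user-user measure over a heterogeneous network of users, posts, and topics. It walks the hypothetical path user-post-topic-post-user and counts path instances whose posts agree in polarity (evidence of pseudo-friendship) versus those that disagree (evidence of dissimilarity). The toy data, field names, and the assumption that post polarities are already known are all illustrative.

```python
# Sketch of a meta path-based user-user measure (illustrative, not the paper's model).
# Meta path: user -> post -> topic -> post -> user. Agreeing path instances add to a
# pseudo-friendship count; opposing ones add to a dissimilarity count.
from collections import defaultdict
from itertools import combinations

# Toy heterogeneous network: each post links one user to one topic and carries
# a sentiment polarity in {+1, -1} (assumed known for this sketch).
posts = [
    {"user": "u1", "topic": "t1", "polarity": +1},
    {"user": "u2", "topic": "t1", "polarity": +1},
    {"user": "u3", "topic": "t1", "polarity": -1},
    {"user": "u1", "topic": "t2", "polarity": -1},
    {"user": "u3", "topic": "t2", "polarity": -1},
]

def user_user_scores(posts):
    """Count user-post-topic-post-user path instances, split by opinion agreement."""
    by_topic = defaultdict(list)
    for p in posts:
        by_topic[p["topic"]].append(p)

    sim = defaultdict(int)  # agreeing path instances (pseudo-friendship evidence)
    dis = defaultdict(int)  # opposing path instances (dissimilarity evidence)
    for topic_posts in by_topic.values():
        for a, b in combinations(topic_posts, 2):
            if a["user"] == b["user"]:
                continue
            pair = tuple(sorted((a["user"], b["user"])))
            if a["polarity"] == b["polarity"]:
                sim[pair] += 1
            else:
                dis[pair] += 1
    return sim, dis

sim, dis = user_user_scores(posts)
print("pseudo-friendship counts:", dict(sim))
print("dissimilarity counts:", dict(dis))
```

In the paper's framework these counts would feed a semi-supervised refining model; here they simply illustrate how similar and opposite opinions along a meta path can yield separate similarity and dissimilarity signals between users.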
