Article

Automatic topic identification of health-related messages in online health community using text classification

Journal

SPRINGERPLUS
Volume 2

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1186/2193-1801-2-309

Keywords

Online health community; Text classification; Topic identification; Topic classification

Funding

  1. National Natural Science Foundation of China [71171131]

To help patients in online health communities obtain the informational and emotional support they need, this paper proposes a topic identification approach that automatically identifies the topics of health-related messages, so that patients can reach the messages most relevant to their queries efficiently. We present a feature-based classification framework for automatic topic identification. We first collected messages related to a set of predefined topics from an online health community. We then combined three types of features (n-gram-based features, domain-specific features, and sentiment features) to build four feature sets for representing health-related text. Finally, we applied three text classification techniques, C4.5, Naive Bayes, and SVM, to evaluate our topic classification model. Comparing the feature sets and classification techniques, we found that n-gram-based, domain-specific, and sentiment features were all effective in distinguishing different types of health-related topics. Feature reduction based on information gain also improved topic classification performance. Among the classification techniques, SVM significantly outperformed C4.5 and Naive Bayes. The experimental results demonstrate that the proposed approach can identify the topics of online health-related messages efficiently.
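As a rough illustration of the information-gain feature reduction mentioned in the abstract, the sketch below scores word n-gram presence features by IG(C; f) = H(C) - H(C | f) and keeps the top k. It is a minimal sketch, not the authors' implementation: the toy messages, the topic labels "treatment" and "emotion", the whitespace tokenizer, and the function names are all hypothetical, and the paper's domain-specific and sentiment features are not modeled here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def ngrams(text, n_max=2):
    """Set of word n-grams (n = 1..n_max) from a lowercased, whitespace-split message."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }


def information_gain(docs, labels, feature):
    """IG of a binary presence feature: H(C) minus the weighted entropy of each split."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


def select_features(messages, labels, k):
    """Return the k n-gram features with the highest IG (ties broken alphabetically)."""
    docs = [ngrams(m) for m in messages]
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda f: (-information_gain(docs, labels, f), f))
    return ranked[:k]


# Hypothetical toy data; the paper's actual topics and messages are not given here.
messages = [
    "chemo side effects nausea",
    "chemo dose schedule",
    "feeling scared and alone",
    "so scared tonight",
]
labels = ["treatment", "treatment", "emotion", "emotion"]

top_features = select_features(messages, labels, k=2)
```

On this toy data the two selected features are the unigrams that perfectly separate the two hypothetical topics. The reduced feature set would then be passed to a classifier such as C4.5, Naive Bayes, or SVM.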
