Journal
PATTERN RECOGNITION
Volume 61, Issue -, Pages 511-523
Publisher
ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2016.08.011
Keywords
Feature selection; Markov blanket; Information theory; Semi-supervised learning; Representative features
Funding
- National Natural Science Foundation of China [61139002, 61501229, 11547040]
- Guangdong Province Natural Science Foundation [2016A030310051, 2015KONCX143]
- Shenzhen Fundamental Research Foundation [JCYJ20150625101524056]
- SZU R/D Fund [2016047]
- SKL-MCCS
Feature selection (FS) plays an important role in data mining and recognition, especially for large-scale text, image and biological data. In supervised feature selection, the Markov blanket provides a complete and sound solution to the selection of optimal features, and thoroughly accounts for both the relevance of features to the class and the conditional independence relationships between features. However, incomplete label information makes it particularly difficult to acquire the optimal feature subset. In this paper, we propose a novel algorithm called the Semi-supervised Representatives Feature Selection algorithm based on information theory (SRFS), which is independent of any algorithm used for classification learning, and can rapidly and effectively identify and remove non-essential information as well as irrelevant and redundant features. More importantly, through the relevance gain, the unlabeled data are utilized in the Markov blanket in the same way as the labeled data. Our results on several benchmark datasets demonstrate that SRFS can significantly improve upon state-of-the-art supervised and semi-supervised algorithms. (C) 2016 Elsevier Ltd. All rights reserved.
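The abstract does not spell out how SRFS computes its relevance gain, so the sketch below does not attempt to reproduce it. It only illustrates the general idea the abstract builds on: rank features by an information-theoretic relevance to the class, then drop any feature for which an already selected feature acts as an approximate Markov blanket. The redundancy rule used here is the well-known FCBF-style test on symmetric uncertainty, applied to labeled, discrete data only. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def symmetric_uncertainty(xs, ys):
    """Mutual information normalised to [0, 1]."""
    denom = entropy(xs) + entropy(ys)
    return 0.0 if denom == 0 else 2.0 * mutual_information(xs, ys) / denom


def select_features(features, labels, threshold=0.0):
    """Assumed FCBF-style filter, not the SRFS algorithm itself.

    features: dict mapping feature name -> list of discrete values.
    Returns the names of features kept, most relevant first.
    """
    # Relevance of each feature to the class.
    relevance = {
        name: symmetric_uncertainty(col, labels)
        for name, col in features.items()
    }
    # Discard irrelevant features, then rank the rest by relevance.
    ranked = sorted(
        (name for name in features if relevance[name] > threshold),
        key=lambda name: -relevance[name],
    )
    selected = []
    for name in ranked:
        # A kept feature is an approximate Markov blanket for `name` if it
        # is at least as correlated with `name` as `name` is with the class.
        redundant = any(
            symmetric_uncertainty(features[kept], features[name])
            >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Toy data: "a" determines the class, "b" duplicates "a", "c" is independent.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
    "c": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(select_features(features, labels))  # prints ['a']
```

In this toy run, "c" is removed as irrelevant (zero mutual information with the class) and "b" is removed as redundant given "a". A semi-supervised method such as SRFS would additionally draw on unlabeled samples, which this sketch ignores.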