4.7 Article

Sparse feature selection: Relevance, redundancy and locality structure preserving guided by pairwise constraints

期刊

APPLIED SOFT COMPUTING
卷 87, 期 -, 页码 -

出版社

ELSEVIER
DOI: 10.1016/j.asoc.2019.105956

关键词

Sparse feature selection; l(1)-norm; Pairwise redundancy; Graph Laplacian; Locality structure preserving; Pairwise constraints

向作者/读者索取更多资源

Selection of features as a pre-processing stage is an essential issue in many machine learning tasks (such as classification) to reduce data dimensionality as there are many irrelevant and redundant features that can mislead the learning process. Graph-based sparse feature selection is developed to overcome this issue. In this paper, a novel graph-based sparse feature selection method is proposed that take into account both issues: relevancy and redundancy analysis. An empirical loss function joining with l(1)-norm regularization term is proposed to overcome the relevancy issue and the redundancy issue is overcome by introducing a regularization term that prefers uncorrelated features. Furthermore, the proposed learning procedure is guided by two different sets of supervision information as pairs of must-linked (positive) and cannot-linked (negative) constraint sets to select a discriminative feature subset. These guiding information besides the whole data points are encoded in the graph Laplacian matrix that preserves the locality structure of the original data. The graph Laplacian matrix is constructed by two different approaches. Our first approach tries to preserve the structure of the original data guided just by the positive data points (unique samples in the must-linked constraints), and our second approach applies a normalized adapted affinity matrix to embed the pairwise must-linked and cannot-linked constraints as well as the neighborhood relationships information, all together. The experimental results on a number of several datasets from the University of California-Irvine machine learning repository, in addition to several high dimensional gene expression datasets show the efficacy of the proposed methods in the classification tasks compared to several powerful feature selection methods. (C) 2019 Elsevier B.V. All rights reserved.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据