4.7 Article

Deep unsupervised feature selection by discarding nuisance and correlated features

Journal

NEURAL NETWORKS
Volume 152, Pages 34-43

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2022.04.002

Keywords

Unsupervised feature selection; Laplacian score; Concrete layer

Funding

  1. NIH [R01RGM131642, UM1DA051410, U54AG076043, U01DA053628, R01GM135928, P50CA121974]


Modern datasets often contain large subsets of correlated features and nuisance features. We propose an unsupervised feature selection method that avoids selecting nuisance features using the Laplacian score criterion, and we employ an autoencoder architecture to handle correlated features. Experiments on several real-world datasets show that the approach outperforms similar methods designed to avoid only correlated or only nuisance features.
Modern datasets often contain large subsets of correlated features and nuisance features, which are unrelated or only loosely related to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the graph Laplacian's leading eigenvectors. We demonstrate that in the presence of large numbers of nuisance features, the Laplacian must be computed on the subset of selected features rather than on the complete feature set. To do this, we propose a fully differentiable approach for unsupervised feature selection, utilizing the Laplacian score criterion to avoid the selection of nuisance features. We employ an autoencoder architecture to cope with correlated features, trained to reconstruct the data from the subset of selected features. We build on the recently proposed concrete layer, which allows controlling the number of selected features via architectural design and thereby simplifies the optimization process. Experimenting on several real-world datasets, we demonstrate that our proposed approach outperforms similar approaches designed to avoid only correlated or only nuisance features, but not both. Several state-of-the-art clustering results are reported. (C) 2022 Elsevier Ltd. All rights reserved.
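
To make the Laplacian score criterion mentioned in the abstract concrete, the following is a minimal NumPy sketch of the classical (non-differentiable) Laplacian score. It is not the paper's method: the function name `laplacian_scores`, the dense Gaussian affinity, and the bandwidth parameter `sigma` are assumptions chosen for illustration. Under this classical definition, a lower score means the feature varies more smoothly over the sample graph.

```python
import numpy as np

def laplacian_scores(X, sigma=1.0):
    """Classical Laplacian score for each column of X (lower = smoother).

    Illustrative sketch only: the graph is a dense Gaussian affinity over
    all samples, which is an assumption, not the construction in the paper.
    """
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances between samples (rows).
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # Gaussian affinity matrix with no self-loops.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    deg = W.sum(axis=1)        # node degrees (diagonal of D)
    L = np.diag(deg) - W       # unnormalized graph Laplacian L = D - W

    # Remove the degree-weighted mean of every feature.
    Xc = X - (deg @ X) / deg.sum()

    num = np.sum(Xc * (L @ Xc), axis=0)           # f^T L f per feature
    den = np.sum(deg[:, None] * Xc ** 2, axis=0)  # f^T D f per feature
    return num / np.maximum(den, 1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Feature 0 separates two clusters; feature 1 is pure noise (a nuisance feature).
    informative = np.repeat([0.0, 5.0], 30) + 0.1 * rng.standard_normal(60)
    nuisance = rng.standard_normal(60)
    print(laplacian_scores(np.column_stack([informative, nuisance])))
```

In this toy setting the graph is built from all features at once. The abstract argues that when nuisance features are numerous, the Laplacian should instead be computed on the selected subset, which this sketch does not attempt.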
