4.7 Article

Top-k Feature Selection Framework Using Robust 0-1 Integer Programming

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2020.3009209

Keywords

0-1 integer programming; feature selection (FS); l(0,2)-norm; nonconvex optimization.

Funding

  1. National Natural Science Foundation of China [61772373, 61922064, 61772374, 61806003]
  2. Zhejiang Provincial Natural Science Foundation [LR17F030001]
  3. Project of Science and Technology Plans of Wenzhou City [C20170008, ZG2017016]
  4. Australian Research Council [FL-170100117]

The article presents a novel feature selection framework to select the exact top-k features by utilizing the l(0,2)-norm as the matrix sparsity constraint. The difficult l(0,2)-norm constrained problem is transformed into an equivalent 0-1 integer constraint and replaced with two continuous constraints. The resulting framework is theoretically equivalent to the l(0,2)-norm constrained problem and can be optimized using the alternating direction method of multipliers (ADMM).
Feature selection (FS), which identifies the relevant features in a data set to facilitate subsequent data analysis, is a fundamental problem in machine learning and has been widely studied in recent years. Most FS methods rank the features by their scores under a specific criterion and then select the k top-ranked features, where k is the number of desired features. However, the features selected this way are usually not the true top-k features and may be a suboptimal choice. To address this issue, we propose a novel FS framework in this article to select the exact top-k features in the unsupervised, semisupervised, and supervised scenarios. The new framework utilizes the l(0,2)-norm as the matrix sparsity constraint rather than its relaxations, such as the l(1,2)-norm. Since the l(0,2)-norm constrained problem is difficult to solve, we transform the discrete l(0,2)-norm-based constraint into an equivalent 0-1 integer constraint and then replace the 0-1 integer constraint with two continuous constraints. The obtained top-k FS framework with two continuous constraints is theoretically equivalent to the l(0,2)-norm constrained problem and can be optimized by the alternating direction method of multipliers (ADMM). Unsupervised and semisupervised FS methods are developed based on the proposed framework, and extensive experiments on real-world data sets are conducted to demonstrate the effectiveness of the proposed FS framework.
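
To make the chain of reformulations concrete, the following is a minimal sketch, not the authors' implementation. It assumes the l(0,2)-norm of a weight matrix W counts its nonzero rows (one row per feature), and that a 0-1 selector vector s with exactly k ones picks the kept features. The abstract does not say which two continuous constraints replace the 0-1 constraint; the pair used here (a box constraint plus a sphere constraint centered at 1/2) is one well-known choice and is an assumption. All function names are hypothetical.

```python
import math


def l20_norm(W, tol=1e-12):
    """Count the rows of W with nonzero Euclidean norm (row sparsity)."""
    return sum(1 for row in W if math.sqrt(sum(v * v for v in row)) > tol)


def apply_selector(W, s):
    """Keep row i of W when s[i] == 1 and zero it out when s[i] == 0."""
    return [[s_i * v for v in row] for s_i, row in zip(s, W)]


def in_box(s, tol=1e-9):
    """Continuous constraint 1 (assumed): every entry lies in [0, 1]."""
    return all(-tol <= v <= 1 + tol for v in s)


def on_sphere(s, tol=1e-9):
    """Continuous constraint 2 (assumed): ||s - 1/2||_2^2 == d / 4."""
    d = len(s)
    return abs(sum((v - 0.5) ** 2 for v in s) - d / 4.0) <= tol


def is_binary_via_continuous(s):
    """Inside the box, (v - 1/2)^2 <= 1/4 with equality only at v in {0, 1},
    so the two constraints together hold exactly for 0-1 vectors."""
    return in_box(s) and on_sphere(s)


# Toy weight matrix: 4 features (rows), 2 output dimensions (columns).
W = [[1.0, 2.0], [0.5, 0.0], [3.0, -1.0], [0.0, 4.0]]
s = [1, 0, 1, 0]  # hypothetical selector keeping k = 2 features

print(l20_norm(apply_selector(W, s)))                  # number of kept rows
print(is_binary_via_continuous(s))                     # binary selector
print(is_binary_via_continuous([0.5, 0.5, 0.5, 0.5]))  # fractional selector
```

In this sketch the cardinality requirement (the entries of s summing to k) stays a simple linear constraint, and the feasible set is described without an explicit integrality condition, which is the kind of structure ADMM-style splitting can handle.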
