Article

Rough set methods in feature selection via submodular function

Journal

SOFT COMPUTING
Volume 21, Issue 13, Pages 3699-3711

Publisher

SPRINGER
DOI: 10.1007/s00500-015-2024-7

Keywords

Attribute reduction; Granular computing; Mutual information; Rough set; Submodular function

Funding

  1. National Natural Science Foundation of China [61379049]

Abstract

Attribute reduction is an important problem in data mining and machine learning: it can highlight favorable features and decrease the risk of over-fitting, thereby improving learning performance. In this regard, rough sets offer interesting opportunities. A reduct in rough sets is a subspace of attributes/features that is jointly sufficient and individually necessary to satisfy a certain criterion. Excessive attributes may reduce diversity and increase correlation among features, while fewer attributes may achieve nearly equal or even higher classification accuracy in some specific classifiers; this motivates us to address dimensionality reduction via attribute reduction from the joint viewpoint of learning performance and reduct size. In this paper, we propose a new attribute reduction criterion that selects the fewest attributes while largely preserving the best performance of the corresponding learning algorithms. The main contributions of this work are twofold. First, we define the concept of a k-approximate-reduct, rather than restricting attention to the minimum reduct, which provides an important view of the connection between the size of an attribute reduct and learning performance. Second, we develop a greedy algorithm for attribute reduction based on mutual information, and use submodular functions to analyze its convergence. The diminishing-returns property of submodularity provides a solid guarantee for the reasonableness of the k-approximate-reduct. Notably, rough sets serve as an effective tool to evaluate both the marginal and joint probability distributions among attributes in mutual information.
Extensive experiments on six real-world public datasets from a machine learning repository demonstrate that the subset selected by the mutual-information reduct achieves higher accuracy with fewer attributes when building naive Bayes and radial basis function network classifiers.
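The greedy mutual-information selection the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the entropy-based estimator over equivalence classes (the rough-set-style probability estimates mentioned above), and the stopping rule are all assumptions made for the example. The objective I(S; D) is monotone and exhibits diminishing marginal gains, which is the submodularity property the paper exploits.

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (bits) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(rows, attrs, decisions):
    """Estimate I(attrs; decision) = H(D) - H(D | attrs) from the data table.

    Conditional entropy is computed over the equivalence classes (blocks)
    induced by the chosen attributes, mirroring rough-set-style estimates
    of the marginal and joint distributions.
    """
    n = len(rows)
    blocks = {}
    for row, d in zip(rows, decisions):
        key = tuple(row[a] for a in attrs)
        blocks.setdefault(key, []).append(d)
    h_cond = sum(len(b) / n * entropy(b) for b in blocks.values())
    return entropy(decisions) - h_cond

def greedy_reduct(rows, all_attrs, decisions, eps=1e-9):
    """Greedily add the attribute with the largest mutual-information gain.

    Because the objective is submodular, gains can only diminish as the
    selected set grows; we stop once I(S; D) matches the information carried
    by the full attribute set (or no attribute yields a positive gain).
    """
    selected = []
    current = 0.0
    remaining = list(all_attrs)
    target = mutual_information(rows, all_attrs, decisions)
    while remaining and current < target - eps:
        best, best_gain = None, eps
        for a in remaining:
            gain = mutual_information(rows, selected + [a], decisions) - current
            if gain > best_gain:
                best, best_gain = a, gain
        if best is None:
            break  # no attribute adds information
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected

# Toy table: attribute 0 determines the decision, attribute 1 is noise,
# so the greedy reduct keeps only attribute 0.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
decisions = [0, 0, 1, 1]
print(greedy_reduct(rows, [0, 1], decisions))
```

Relaxing the stopping threshold (accepting I(S; D) within a factor of the full-set value rather than equal to it) corresponds to the k-approximate-reduct idea: trading a slightly smaller information value for a smaller attribute subset.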
