Article

Dimensionality reduction by feature clustering for regression problems

Journal

INFORMATION SCIENCES
Volume 299, Pages 42-57

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2014.12.003

Keywords

Machine learning; Feature clustering; Feature extraction; Correlation coefficient; Mutual information

Funding

  1. National Science Council [NSC-102-2221-E-110-070, NSC-102-2622-E-110-004-CC3]
  2. Aim for the Top University Plan of the National Sun Yat-Sen University and Ministry of Education


One of the issues encountered in classification and regression is the processing inefficiency caused by the large number of input dimensions in the given training data set. Many dimensionality reduction approaches have been proposed to address this issue by reducing the number of input dimensions while maintaining the generalization capability of the original data set. However, less attention has been paid to regression than to classification. In addition, most existing methods rely on computations with covariance matrices, which makes the reduction process inefficient. In this paper, we propose a machine learning based dimensionality reduction approach for regression problems. For a given set of training instances, a group of clusters is formed such that the instances included in the same cluster are similar to each other. Then one new feature is extracted from each cluster through a certain weighted combination of the training instances. Consequently, the dimensionality of the original data set is reduced. The clusters are created incrementally and automatically, without requiring the user to specify the number of clusters in advance. The characteristics of the original data set are substantially retained since all the original features are involved in the derivation of the extracted features. Also, the computation with covariance matrices is avoided, and thus efficiency is maintained. A number of experiments on real-world data sets are conducted to demonstrate the effectiveness of the proposed approach. © 2014 Elsevier Inc. All rights reserved.
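
The abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that the clusters group input features (as the title suggests), that similarity is the Pearson correlation with a running cluster center, and that each extracted feature is a combination of a cluster's members weighted by their absolute correlation with the regression target. The function names, the `threshold` parameter, and the weighting rule are all hypothetical choices made for illustration.

```python
import numpy as np


def incremental_feature_clusters(X, threshold=0.8):
    """Group the columns of X one at a time (illustrative assumption).

    A column joins the most similar existing cluster if its correlation
    with that cluster's center reaches `threshold`; otherwise it starts a
    new cluster. The number of clusters is never specified in advance.
    Assumes no column of X is constant.
    """
    # Standardize the columns so that cluster centers are comparable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    clusters = []  # list of lists of column indices
    centers = []   # mean of the standardized member columns of each cluster
    for j in range(Z.shape[1]):
        best, best_sim = None, threshold
        for k, center in enumerate(centers):
            sim = np.corrcoef(Z[:, j], center)[0, 1]
            if sim >= best_sim:
                best, best_sim = k, sim
        if best is None:
            clusters.append([j])
            centers.append(Z[:, j].copy())
        else:
            clusters[best].append(j)
            centers[best] = Z[:, clusters[best]].mean(axis=1)
    return clusters


def extract_features(X, y, clusters):
    """Build one new feature per cluster (hypothetical weighting rule).

    Each member column is weighted by its absolute correlation with the
    target y, and the weights are normalized to sum to one per cluster.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    reduced = np.empty((X.shape[0], len(clusters)))
    for k, members in enumerate(clusters):
        w = np.array([abs(np.corrcoef(Z[:, j], y)[0, 1]) for j in members])
        # Fall back to uniform weights if no member correlates with y.
        w = w / w.sum() if w.sum() > 0 else np.full(len(members), 1.0 / len(members))
        reduced[:, k] = Z[:, members] @ w
    return reduced
```

Calling `incremental_feature_clusters(X)` followed by `extract_features(X, y, clusters)` on an instance-by-feature matrix `X` and a target vector `y` yields a matrix with one column per cluster. Every original feature contributes to some extracted feature, and no covariance matrix of the full feature set is formed, which mirrors the two properties the abstract emphasizes.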
