Article

Feature selection by using chaotic cuckoo optimization algorithm with levy flight, opposition-based learning and disruption operator

Journal

SOFT COMPUTING
Volume 25, Issue 4, Pages 2911-2933

Publisher

SPRINGER
DOI: 10.1007/s00500-020-05349-x

Keywords

Feature selection; High-dimensional data; Cuckoo optimization algorithm; Chaotic theory; Levy flight; Disruption operator; Opposition-based learning

This study proposes a chaotic cuckoo optimization algorithm combined with Levy flight, opposition-based learning, and a disruption operator to select the optimal feature subspace for data classification while avoiding local optima. Extensive experiments on 20 high-dimensional datasets showed higher classification accuracy than state-of-the-art methods and an ability to select the most relevant features.

Feature selection, which plays an important role in high-dimensional data analysis, has drawn increasing attention in recent years. Finding the most relevant and important features for classification is one of the most important tasks in data mining and machine learning, since datasets commonly contain irrelevant features that lower the accuracy rate and slow down the classifier. Feature selection is an optimization process that aims to improve the accuracy rate of data classification while reducing the number of selected features. Using too many features both requires a large memory capacity and leads to slow execution. Feature selection algorithms are responsible for deciding which features a classification algorithm should use. Traditional algorithms proved inefficient because of the high dimensionality of the problem, so evolutionary algorithms have been used to improve the search. The algorithm proposed in this paper, the chaotic cuckoo optimization algorithm with Levy flight, disruption operator and opposition-based learning (CCOALFDO), is applied to select the optimal feature subspace for classification. It reduces the randomness in feature selection and avoids getting stuck in local optima, which leads to a more interesting feature subset. Extensive experiments on 20 high-dimensional datasets demonstrate the effectiveness and efficiency of the proposed method. The results show that the proposed method outperforms state-of-the-art methods in terms of classification accuracy rate. In addition, they demonstrate the ability of CCOALFDO to select the most relevant features for classification tasks. Thus, it is a reasonable approach to handling noise and avoiding serious negative impacts on the classification accuracy rate in real-world datasets.
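The abstract names its building blocks without giving formulas, so the following is only a rough sketch of how such operators are commonly written, not the paper's actual CCOALFDO formulation. The logistic map as the chaotic sequence, Mantegna's method for Levy steps, the 0.5 threshold for turning a continuous position into a feature mask, and the weighted fitness with `alpha` are all assumptions made for illustration; every function name here is hypothetical.

```python
import math
import random


def logistic_map(x, r=4.0):
    """One step of the logistic chaotic map (assumed choice of chaotic map)."""
    return r * x * (1.0 - x)


def opposite(position, lower=0.0, upper=1.0):
    """Opposition-based learning: reflect each coordinate within [lower, upper]."""
    return [lower + upper - x for x in position]


def levy_step(beta=1.5, rng=random):
    """Draw one Levy-like step length with Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def to_mask(position, threshold=0.5):
    """Binarize a continuous position: True means the feature is selected."""
    return [x > threshold for x in position]


def fitness(mask, error_rate, alpha=0.99):
    """Assumed wrapper-style objective (lower is better): weighted sum of the
    classifier error rate and the fraction of features selected."""
    return alpha * error_rate + (1 - alpha) * sum(mask) / len(mask)
```

In a sketch like this, a candidate solution is a vector in [0, 1] with one entry per feature; `opposite` gives a mirrored candidate to evaluate alongside it, `levy_step` scales occasional long jumps away from a local optimum, and `error_rate` would come from training a classifier on the features selected by `to_mask`.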
