LAGOA: Learning automata based grasshopper optimization algorithm for feature selection in disease datasets

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s12652-021-03155-3

Keywords

Grasshopper optimization algorithm; Learning automata; Two-phase mutation; Biomedical data; Feature selection; Cancer data

This paper emphasizes the importance of feature selection in predictive modeling, especially in disease datasets. The authors introduce a wrapper-based feature selection model and an improved Grasshopper Optimization Algorithm, LAGOA, which utilizes Learning Automata (LA) and two-phase mutation for enhancing algorithm performance.
In predictive modelling, feature selection is important because irrelevant features, when used with powerful classifiers, can lead to over-fitting and thus produce models that perform worse than they would without those features. This is particularly important for disease datasets, where many features or attributes are available through patients' medical records and many of them may not be relevant to the diagnosis of a specific disease. Wrong models in this case can be disastrous, leading to wrong diagnoses and, in extreme cases, loss of life. To this end, we use a wrapper-based feature selection model. In recent years, the Grasshopper Optimization Algorithm (GOA) has proved its superiority over other optimization algorithms in different research areas. In this paper, we propose an improved version of GOA, called LAGOA, which uses Learning Automata (LA) to adjust the parameters of GOA adaptively, and two-phase mutation to enhance the exploitation capability of the algorithm. LA is used to adjust the parameter values of each grasshopper in the population individually. In two-phase mutation, the first phase reduces the number of selected features while preserving high classification accuracy, and the second phase adds relevant features that increase the classification accuracy. The proposed method has been applied to the Breast Cancer (Wisconsin), Breast Cancer (Diagnosis), Statlog (Heart), Lung Cancer, SpectF Heart and Hepatitis datasets taken from the UCI Machine Learning Repository. Experimental results confirm its superiority over the state-of-the-art methods considered here for comparison.
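
The two-phase mutation described in the abstract can be illustrated with a minimal sketch on a binary feature mask. This is not the authors' implementation: the function names `two_phase_mutation` and `toy_accuracy`, the mutation probability `p_mut`, and the acceptance rules (keep a removal if accuracy does not drop, keep an addition only if accuracy strictly improves) are assumptions made for illustration. In a real wrapper setup the accuracy function would train and evaluate a classifier on the selected columns; here a toy stand-in is used so the example runs on its own.

```python
import random
from typing import Callable, List, Tuple


def two_phase_mutation(
    mask: List[int],
    accuracy: Callable[[List[int]], float],
    rng: random.Random,
    p_mut: float = 0.5,
) -> Tuple[List[int], float]:
    """Hypothetical two-phase mutation on a 0/1 feature mask."""
    best = list(mask)
    best_acc = accuracy(best)

    # Phase 1 (assumed rule): try dropping selected features and keep
    # the drop whenever accuracy does not fall.
    for i in range(len(best)):
        if best[i] == 1 and rng.random() < p_mut:
            best[i] = 0
            acc = accuracy(best)
            if acc >= best_acc:
                best_acc = acc
            else:
                best[i] = 1  # revert: the feature was needed

    # Phase 2 (assumed rule): try adding unselected features and keep
    # the addition only when accuracy strictly improves.
    for i in range(len(best)):
        if best[i] == 0 and rng.random() < p_mut:
            best[i] = 1
            acc = accuracy(best)
            if acc > best_acc:
                best_acc = acc
            else:
                best[i] = 0  # revert: the feature did not help

    return best, best_acc


def toy_accuracy(mask: List[int]) -> float:
    """Toy stand-in for a wrapper fitness: features 0 and 2 are 'relevant'.

    A real wrapper would fit a classifier on the selected features and
    return its validation accuracy instead.
    """
    relevant = (0, 2)
    return 0.5 + 0.25 * sum(mask[i] for i in relevant)


# Usage: p_mut=1.0 makes every candidate flip get tried, so the run is deterministic.
start_mask = [1, 1, 0, 1, 0]
new_mask, new_acc = two_phase_mutation(start_mask, toy_accuracy, random.Random(0), p_mut=1.0)
print(new_mask, new_acc)
```

With the toy fitness, phase 1 discards the two irrelevant features that were initially selected, and phase 2 adds the one relevant feature that was missing. The sketch does not model the Learning Automata parameter adaptation, for which the abstract gives no details.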
