4.8 Review

Stability of feature selection algorithm: A review

Publisher

ELSEVIER
DOI: 10.1016/j.jksuci.2019.06.012

Keywords

Feature selection; Knowledge discovery; Stability; Robustness; Instability; Perturbation

Feature selection is a tool for understanding a problem by analyzing its relevant features; it can improve classifier performance and reduce computational load. However, high correlation between features often makes traditional feature selection algorithms unstable, which reduces confidence in the selected features. Achieving high stability in feature selection algorithms is therefore crucial.
Feature selection is a knowledge discovery tool that provides an understanding of a problem through analysis of its most relevant features. It aims to build better classifiers by identifying the significant features, which also reduces computational overhead. Recent advances in high-throughput technologies produce high-dimensional data, for which feature selection is both convenient and necessary. This calls into question the interpretability and stability of traditional feature selection algorithms. High correlation among features frequently produces multiple equally optimal signatures, which makes traditional feature selection methods unstable and reduces confidence in the selected features. Stability is the robustness of the feature preferences that a method produces to perturbation of the training samples; it indicates the reproducibility of the feature selection method. When evaluating feature selection performance, high stability is as important as high classification accuracy. In this paper, we provide an overview of feature selection techniques and of the instability of feature selection algorithms. We also present some solutions that can handle the different sources of instability. (c) 2019 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an
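
The abstract defines stability as robustness of the selected features to perturbation of the training samples. One common way to put a number on that idea is sketched below: resample the training set several times, run the selector on each resample, and average the pairwise Jaccard similarity of the resulting feature subsets. This is a minimal illustration under stated assumptions, not the measure the paper recommends. The function names (`jaccard`, `selection_stability`, `select_top_k`, `bootstrap_stability`), the bootstrap resampling scheme, and the toy correlation-based selector are all hypothetical choices made for this example.

```python
import math
import random
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity of two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(subsets):
    """Mean pairwise Jaccard similarity over all pairs of selected subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return mean(jaccard(a, b) for a, b in pairs)


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx <= 0 or vy <= 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)


def select_top_k(rows, labels, k):
    """Toy selector (assumption): keep the k features most correlated with the label."""
    n_features = len(rows[0])
    scores = [
        abs_correlation([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def bootstrap_stability(rows, labels, k, n_rounds=20, seed=0):
    """Perturb the training set by bootstrap resampling and score the selector."""
    rng = random.Random(seed)
    n = len(rows)
    subsets = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        subsets.append(
            select_top_k([rows[i] for i in idx], [labels[i] for i in idx], k)
        )
    return selection_stability(subsets)
```

A score near 1 means the selector returns almost the same features on every resample; a score near 0 means the subsets barely overlap. Any selector that returns a set of feature indices could be substituted for `select_top_k`.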

Recommended

Article Computer Science, Artificial Intelligence

High-dimensional microarray dataset classification using an improved adam optimizer (iAdam)

Utkarsh Mahadeo Khaire, R. Dhanalakshmi

JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING (2020)

Article Computer Science, Hardware & Architecture

Effects of Random Forest Parameters in the Selection of Biomarkers

Utkarsh Mahadeo Khaire, R. Dhanalakshmi

Summary: The microarray dataset covers almost every gene in the genome and helps with cancer diagnosis, prognosis, and treatment. The curse of dimensionality in microarray data obscures useful information and leads to computational instability. Feature selection and the random forest algorithm play a crucial role in extracting important features and reducing data dimensionality.

COMPUTER JOURNAL (2021)

Article Engineering, Electrical & Electronic

Stability Investigation of Improved Whale Optimization Algorithm in the Process of Feature Selection

Utkarsh Mahadeo Khaire, R. Dhanalakshmi

Summary: In this study, a new feature selection model based on improved WOA (iWOA) is proposed to select significant features from a high-dimensional microarray dataset. The stability of the results obtained is evaluated with the existing stability index that satisfies all the required characteristics of the stability measure.

IETE TECHNICAL REVIEW (2022)

Article Computer Science, Hardware & Architecture

Improved salp swarm algorithm based on the levy flight for feature selection

K. Balakrishnan, R. Dhanalakshmi, Utkarsh Mahadeo Khaire

Summary: The study introduces an enhanced version of Salp Swarm Algorithm (iSSA) which improves exploratory capabilities by randomizing location updates and using Levy flights to converge the model towards global optima. Experimental results show that iSSA outperforms SSA in six high-dimensional datasets, providing higher confidence in feature selection results.

JOURNAL OF SUPERCOMPUTING (2021)

Article Computer Science, Software Engineering

Excogitating marine predators algorithm based on random opposition-based learning for feature selection

Kulanthaivel Balakrishnan, Ramasamy Dhanalakshmi, Utkarsh Mahadeo Khaire

Summary: This research introduces the marine predators algorithm (MPA) and its improved version, ROBL-MPA, for handling high-dimensional datasets. ROBL-MPA outperforms the traditional MPA.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE (2022)

Article Computer Science, Artificial Intelligence

A venture to analyse stable feature selection employing augmented marine predator algorithm based on opposition-based learning

Kulanthaivel Balakrishnan, Ramasamy Dhanalakshmi, Utkarsh Khaire

Summary: By improving the marine predator algorithm with opposition-based learning, stable feature selection is achieved in high-dimensional datasets, leading to enhanced classification accuracy. The proposed OBL-based marine predator algorithm demonstrates superior converging capacity, classification accuracy, and stable feature selection compared to conventional feature selection techniques.

EXPERT SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

A novel control factor and Brownian motion-based improved Harris Hawks Optimization for feature selection

K. Balakrishnan, R. Dhanalakshmi, Utkarsh Mahadeo Khaire

Summary: The massive growth in data size has led to a proliferation of the need for feature selection methods. This research proposes an enhanced Harris Hawks Optimization algorithm for feature selection, which utilizes Brownian motion and a novel control factor to improve the search process. Experimental results demonstrate the superiority of this algorithm over existing techniques.

JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Instigating the Sailfish Optimization Algorithm Based on Opposition-Based Learning to Determine the Salient Features From a High-Dimensional Dataset

Utkarsh Mahadeo Khaire, R. Dhanalakshmi, K. Balakrishnan, M. Akila

Summary: This research proposes a hybrid combination of Opposition-Based Learning and Sailfish Optimization strategy to recognize salient features in high-dimensional datasets. The method improves exploration capability and convergence rate, achieving better classification accuracy compared to existing methods.

INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING (2023)

Article Engineering, Multidisciplinary

Feature Selection and Classification of Microarray Data for Cancer Prediction Using MapReduce Implementation of Random Forest Algorithm

R. Dhanalakshmi, Utkarsh M. Khaire

JOURNAL OF SCIENTIFIC & INDUSTRIAL RESEARCH (2019)
