Article
Environmental Sciences
Wenguang Wang, Wenhong Wang, Hongfu Liu
Summary: In this paper, a correlation-guided ensemble clustering approach is proposed for hyperspectral band selection. By utilizing ensemble clustering and a consensus function, this approach can effectively select informative and representative bands.
Article
Computer Science, Artificial Intelligence
Peng Zhou, Xia Wang, Liang Du
Summary: Unsupervised feature selection is an important task in machine learning but suffers from stability and robustness issues due to the absence of labels. This paper proposes a novel bi-level feature selection ensemble method that not only ensembles at the feature level but also learns a consensus clustering result to guide the feature selection, outperforming other state-of-the-art methods.
INFORMATION FUSION
(2023)
Article
Computer Science, Artificial Intelligence
Guanghua Fu, Bencheng Li, Yongsheng Yang, Chaofeng Li
Summary: This paper proposes a four-stage ensemble feature selection method called RTEFS, which reduces data dimensionality, improving the accuracy and lowering the computational cost of machine learning models. Experimental results show that RTEFS outperforms its base counterparts in terms of accuracy and F-measure scores.
PATTERN RECOGNITION LETTERS
(2023)
Review
Physics, Multidisciplinary
Malik Yousef, Abhishek Kumar, Burcu Bakir-Gungor
Summary: In the past two decades, advancements in high-throughput technologies have led to exponential growth of gene expression datasets. Integrative approaches combining statistical metrics and biological knowledge are necessary for improving the identification of biomarkers and potential treatment targets. These approaches are expected to enhance disease prediction, diagnosis, treatment, and understanding of disease dynamics.
Article
Computer Science, Information Systems
Mingzhao Wang, Henry Han, Zhao Huang, Juanying Xie
Summary: This paper proposes two unsupervised spectral feature selection algorithms for detecting informative features in high-dimensional data with a small number of samples. These algorithms group features using an advanced Self-Tuning spectral clustering algorithm and detect the globally optimal feature clusters through feature ranking techniques. Extensive experiments demonstrate the effectiveness of the proposed algorithms, especially the one based on the cosine similarity feature ranking technique. The detected features have strong discriminative capabilities, making them suitable for building reliable and explainable AI systems, particularly medical diagnostic systems.
FRONTIERS OF COMPUTER SCIENCE
(2023)
Article
Computer Science, Theory & Methods
Reem Salman, Ayman Alzaatreh, Hana Sulieman
Summary: This study explores the impact of different aggregation strategies on the stability and accuracy of ensemble feature selection, finding significant differences in the performance of ensembles under different aggregation methods, especially between score-based and rank-based aggregation strategies. Simple score-based strategies, such as Arithmetic Mean or L2-norm aggregation, appear to be efficient and compelling in most cases.
JOURNAL OF BIG DATA
(2022)
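The contrast between score-based and rank-based aggregation described in the summary above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `aggregate_scores` and `top_k`, the toy score matrix, and the tie-breaking rule are all assumptions made for illustration. It only shows that the arithmetic mean, the L2-norm, and mean-rank aggregation can pick different features from the same set of selector outputs.

```python
import numpy as np

def aggregate_scores(scores, method="mean"):
    """Combine importance scores from several selectors (hypothetical helper).

    scores: array of shape (n_selectors, n_features); higher means more important.
    Returns one aggregated value per feature; higher still means more important.
    """
    scores = np.asarray(scores, dtype=float)
    if method == "mean":
        # Score-based: arithmetic mean across selectors.
        return scores.mean(axis=0)
    if method == "l2":
        # Score-based: L2-norm of each feature's score vector.
        return np.sqrt((scores ** 2).sum(axis=0))
    if method == "rank":
        # Rank-based: rank 1 is the most important feature within each selector.
        order = np.argsort(-scores, axis=1, kind="stable")
        ranks = np.empty_like(order)
        rows = np.arange(scores.shape[0])[:, None]
        ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
        # Negate the mean rank so that a higher value is still better.
        return -ranks.mean(axis=0)
    raise ValueError(f"unknown method: {method}")

def top_k(aggregated, k):
    """Indices of the k features with the highest aggregated value."""
    return np.argsort(-aggregated, kind="stable")[:k]

# Toy data (assumed): three selectors scoring three features.
# Selector 0 rates feature 0 very highly; the other two rate it lowest.
scores = np.array([
    [0.95, 0.40, 0.30],
    [0.05, 0.40, 0.30],
    [0.10, 0.40, 0.30],
])

best_by_mean = top_k(aggregate_scores(scores, "mean"), 1)
best_by_l2 = top_k(aggregate_scores(scores, "l2"), 1)
best_by_rank = top_k(aggregate_scores(scores, "rank"), 1)
```

In this toy case the L2-norm is dominated by the single large score for feature 0, while the mean and the mean rank both favour feature 1, which every selector rates moderately well. Real ensembles would typically normalise scores per selector before aggregating.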
Article
Computer Science, Artificial Intelligence
Amin Hashemi, Mohammad Bagher Dowlatshahi, Hossein Nezamabadi-pour
Summary: In this paper, ensemble feature selection is modeled as a Multi-Criteria Decision-Making (MCDM) process, and a novel method called EFS-MCDM is proposed to rank and score features. Experimental results demonstrate that the proposed method outperforms other similar methods in terms of accuracy and efficiency.
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
(2022)
Review
Automation & Control Systems
Keyvan Golalipour, Ebrahim Akbari, Seyed Saeed Hamidi, Malrey Lee, Rasul Enayatifar
Summary: Clustering aims to discover natural groupings of patterns, points, or objects, but there is no deterministic way to decide the best method for a given set of input data. Clustering ensembles combine the solutions computed by base clustering algorithms to achieve stability and robustness, while clustering ensemble selection chooses a subset of base clusterings based on quality and diversity for better performance. This survey covers the historical development of data clustering, basic clustering techniques, clustering ensemble algorithms, and clustering ensemble selection techniques for improved quality and diversity.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2021)
Article
Physics, Multidisciplinary
Reem Salman, Ayman Alzaatreh, Hana Sulieman, Shaimaa Faisal
Summary: This study implemented a general framework for the ensemble of multiple feature selection methods, which aggregates importance scores generated by different selection methods to resolve inconsistency issues and control the diversity of selected feature subsets. Experimental results showed that the Within Aggregation Method (WAM) is more stable in identifying important features compared to the Between Aggregation Method (BAM), providing an effective tool for determining the best feature selection method for a given dataset. By applying both WAM and BAM, practitioners can gain a deeper understanding of the feature selection process.
Article
Energy & Fuels
Jose Ortiz-Bejar, Alejandro Zamora-Mendez, Lucas Lugnani, Eric Tellez, Mario R. Arrieta Paternina
Summary: This paper assesses the coherency in power systems using the affinity propagation (AP) algorithm with different distance metrics and quality measurements. The AP method is adopted to identify and distinguish coherent patterns in a power system, and three different distance metrics are evaluated to determine their impact on the clustering quality. The experimental results demonstrate the effectiveness of the proposed strategy in identifying coherent patterns in large-scale power systems.
SUSTAINABLE ENERGY GRIDS & NETWORKS
(2022)
Article
Computer Science, Artificial Intelligence
Renato Cordeiro de Amorim, Vladimir Makarenkov
Summary: This article explores the relationship between the convergence iteration number (τ) of the k-means algorithm and the structure and clustering quality of the data set. It demonstrates that τ can be used to identify irrelevant features, improve feature selection algorithms, and determine the true number of clusters in a data set.
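As a rough illustration of the quantity τ mentioned in the summary above, the sketch below runs plain Lloyd-style k-means and counts iterations until the cluster assignments stop changing. It is an assumption-laden toy, not the procedure from the article: the function name `kmeans_iterations`, the random initialisation, the stopping rule, and the synthetic data are all hypothetical choices.

```python
import numpy as np

def kmeans_iterations(X, k, seed=0, max_iter=300):
    """Run Lloyd's k-means and return (tau, labels).

    tau is the iteration at which assignments first repeat (assumed stopping
    rule), or max_iter if that never happens.
    """
    rng = np.random.default_rng(seed)
    # Initialise centres with k distinct data points (a copy, so X is untouched).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = None
    for it in range(1, max_iter + 1):
        # Squared Euclidean distance from every point to every centre.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            return it, labels
        labels = new_labels
        # Move each centre to the mean of its members; keep empty clusters in place.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return max_iter, labels

# Synthetic data (assumed): two well-separated 2-D blobs.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

tau, labels = kmeans_iterations(X, k=2)
```

To explore the relationship the article describes, one could compare `tau` across runs with and without a candidate feature, or across different values of `k`; how the authors actually turn τ into a feature selection criterion is not stated in the summary.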
Article
Computer Science, Artificial Intelligence
Saptarshi Chakraborty, Swagatam Das
Summary: In this paper, a simple and efficient sparse clustering algorithm called LW-k-means is proposed for high-dimensional data. The algorithm incorporates feature weighting to enable feature selection and has a time complexity similar to traditional algorithms. The strong consistency of the LW-k-means procedure is also established. Experimental results on synthetic and real-life datasets demonstrate that LW-k-means performs competitively in terms of clustering accuracy and computational time compared to existing methods for center-based high-dimensional clustering.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Abdolreza Rashno, Milad Shafipour, Sadegh Fadaei
Summary: This paper introduces a novel multi-objective particle swarm optimization feature selection method. It decodes feature vectors as particles and ranks them in a two-dimensional optimization space. The proposed method incorporates feature ranks to update particle velocity and position during the optimization process. Experimental results demonstrate the effectiveness of the method in finding Pareto Fronts of the best particles in multi-objective optimization space.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Vahid Nosrati, Mohsen Rahmani
Summary: In this paper, the authors enhance the diversity paradigm in ensemble feature selection models by applying a recursive balanced partitioning (RBP) approach. They propose a new diversity measurement and aggregation criterion. Experimental results demonstrate that the proposed RBP method outperforms traditional random partitioning in terms of the diversity achieved. Furthermore, the study shows a positive relationship between diversity and classification accuracy.
NEURAL COMPUTING & APPLICATIONS
(2023)
Article
Biochemical Research Methods
Xudong Zhao, Jingwen Zhai, Tong Liu, Guohua Wang
Summary: This paper proposes an improved variable selection framework for identifying plant PPR proteins. The improvements include the use of a hybrid ensemble classifier and alternating feature selection strategy, and it is found that different base classifiers play an important role as the feature dimension increases. Experimental results demonstrate the effectiveness of the improvements.
BRIEFINGS IN BIOINFORMATICS
(2022)