Article

SYMBOLIC ONE-CLASS LEARNING FROM IMBALANCED DATASETS: APPLICATION IN MEDICAL DIAGNOSIS

INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS (2009)

Journal

INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS

Volume 18, Issue 2, Pages 273-309

Publisher

WORLD SCIENTIFIC PUBL CO PTE LTD

DOI: 10.1142/S0218213009000135

Keywords

Machine learning; imbalanced datasets; one-class learning; classification algorithm; rule extraction

Categories

Computer Science, Artificial Intelligence; Computer Science, Interdisciplinary Applications

Abstract

Real-world applications often involve imbalanced datasets, in which a majority class contains normal data and a minority class contains abnormal or important data. In this work, we give an overview of the class imbalance problem; we review its consequences, possible causes, and existing strategies for coping with the difficulties it creates. As a contribution toward solving this problem, we propose a new rule induction algorithm named Rule Extraction for MEdical Diagnosis (REMED), a symbolic one-class learning approach. To evaluate the proposed method, we use several medical diagnosis datasets and take into account quantitative metrics, comprehensibility, and reliability. We compared REMED with C4.5 and RIPPER combined with over-sampling and cost-sensitive strategies. This empirical analysis showed REMED to be quantitatively competitive with C4.5 and RIPPER in terms of the area under the Receiver Operating Characteristic curve (AUC) and the geometric mean, and to outperform them in terms of comprehensibility and reliability. The results of our experiments show that REMED generated rule systems with a larger degree of abstraction and patterns closer to the well-known abnormal values associated with each medical dataset considered.
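The two quantitative metrics named in the abstract, AUC and the geometric mean, can be illustrated with a minimal Python sketch. It assumes the definitions commonly used in the imbalanced-learning literature: the geometric mean as the square root of sensitivity times specificity, and AUC as the fraction of (minority, majority) pairs that the classifier's scores rank correctly. The abstract does not spell out the exact formulas used in the paper, and the function names and toy labels below are hypothetical, chosen only for illustration.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives; `positive` marks the minority class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def geometric_mean(y_true, y_pred, positive=1):
    """sqrt(sensitivity * specificity); assumed definition, not taken from the paper."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)


def auc(y_true, scores, positive=1):
    """Pairwise (Mann-Whitney) estimate of AUC; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy imbalanced data: 8 majority (0) and 2 minority (1) examples.
y_true = [0] * 8 + [1] * 2
always_majority = [0] * 10                     # 80% accurate, yet finds no minority case
some_detection = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

print(geometric_mean(y_true, always_majority))  # 0.0
print(geometric_mean(y_true, some_detection))   # sqrt(0.5 * 0.75)
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The first call shows why plain accuracy is misleading on imbalanced data: a classifier that always predicts the majority class is right 80% of the time here, but its geometric mean is zero because it never detects a minority case.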

Recommended

Article Computer Science, Information Systems

Feature Selection and Ensemble Learning Techniques in One-Class Classifiers: An Empirical Study of Two-Class Imbalanced Datasets

Chih-Fong Tsai, Wei-Chao Lin

Summary: The study treats class imbalance as an anomaly detection problem and investigates the performance of one-class classification (OCC) classifiers, both individually and in ensemble learning. Results show that OCC classifiers perform well on datasets with high class imbalance ratios. Feature selection does not usually improve their performance, whereas combining multiple OCC classifiers can outperform individual classifiers.

IEEE ACCESS (2021)

Article Health Care Sciences & Services

Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques

Vinod Kumar, Gotam Singh Lalotra, Ponnusamy Sasikala, Dharmendra Singh Rajput, Rajesh Kaluri, Kuruva Lakshmanna, Mohammad Shorfuzzaman, Abdulmajeed Alsufyani, Mueen Uddin

Summary: Healthcare is crucial for every individual, and clinical datasets play a significant role in developing intelligent healthcare systems. However, class imbalance in real-world datasets poses challenges in training classifiers. This study evaluates the performance of six classifiers on five imbalanced clinical datasets and explores different class balancing techniques. The results demonstrate the superiority of the SMOTEEN method among all the tested techniques.

HEALTHCARE (2022)

Article Biology

Diabetic retinopathy screening using deep learning for multi-class imbalanced datasets

Manisha Saini, Seba Susan

Summary: Screening and diagnosis of diabetic retinopathy disease is a significant problem in the biomedical domain. The use of medical imagery from a patient's eye for computer-aided diagnosis has greatly advanced with the success of deep learning. However, challenges with imbalanced datasets, inconsistent annotations, limited samples, and inappropriate evaluation metrics have impacted the performance of deep learning models.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Computer Science, Artificial Intelligence

Minimum class variance class-specific extreme learning machine for imbalanced classification

Bhagat Singh Raghuwanshi, Sanyam Shukla

Summary: This paper introduces a new variant of extreme learning machine, MCVCSELM, for effectively addressing binary class imbalance problems by utilizing minimum class variance and class-specific regularization. Experimental results demonstrate that the proposed algorithm outperforms several state-of-the-art methods for imbalanced learning.

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Article Computer Science, Artificial Intelligence

CDSMOTE: class decomposition and synthetic minority class oversampling technique for imbalanced-data classification

Eyad Elyan, Carlos Francisco Moreno-Garcia, Chrisina Jayne

Summary: Class-imbalanced datasets are common in various domains, and using class decomposition and oversampling methods can effectively reduce the dominance of majority class instances. Experimental results demonstrate the effectiveness and superiority of the proposed hybrid approach in addressing class imbalance.

NEURAL COMPUTING & APPLICATIONS (2021)

Article Computer Science, Artificial Intelligence

Multi-class imbalanced big data classification on Spark

William C. Sleeman, Bartosz Krawczyk

Summary: This paper proposes a compound framework for multi-class big data problems that addresses the presence of multiple classes and high data volumes simultaneously. By analyzing instance-level difficulties in each class and embedding this information in popular resampling algorithms, it achieves informative balancing of multiple classes. An extensive experimental study shows that using instance-level information significantly improves learning from multi-class imbalanced big data.

KNOWLEDGE-BASED SYSTEMS (2021)

Article Computer Science, Artificial Intelligence

KNNOR: An oversampling technique for imbalanced datasets

Ashhadul Islam, Samir Brahim Belhaouari, Atiq Ur Rehman, Halima Bensmail

Summary: This study introduces an algorithm called KNNOR that addresses class imbalance by studying the compactness and location of the minority class, identifying critical and safe areas for augmentation, and generating synthetic data points. Experimental results show that the proposed method consistently outperforms other state-of-the-art oversamplers on several common imbalanced datasets. It is easy to use and available as an open-source Python library.

APPLIED SOFT COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Towards hybrid over- and under-sampling combination methods for class imbalanced datasets: an experimental study

Cian Lin, Chih-Fong Tsai, Wei-Chao Lin

Summary: For imbalanced domain datasets, data re-sampling techniques such as under-sampling the majority class and over-sampling the minority class are often used to construct effective models. This study aims to determine the better order in which to combine under- and over-sampling methods. Experimental results show that if the under-sampling algorithm (IB3) is carefully chosen, adding an over-sampling step does not significantly improve performance. Moreover, performing instance selection first and over-sampling second with the IB3 algorithm yields the best results.

ARTIFICIAL INTELLIGENCE REVIEW (2023)

Article Biochemical Research Methods

Identifying anti-coronavirus peptides by incorporating different negative datasets and imbalanced learning strategies

Yuxuan Pang, Zhuo Wang, Jhih-Hua Jhong, Tzong-Yi Lee

Summary: The study investigated the physicochemical properties of anti-coronavirus (anti-CoV) peptides and established a classifier for identifying them. Imbalanced learning strategies were adopted to address the class-imbalance issue. A two-stage classifier was designed to identify anti-CoV peptides and showed promising results for assisting in the development of novel anti-CoV peptides.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Chemistry, Multidisciplinary

Class Imbalanced Medical Image Classification Based on Semi-Supervised Federated Learning

Wei Liu, Jiaqing Mo, Furu Zhong

Summary: In this paper, a federated learning method combining regularization constraints and pseudo-label construction is proposed to address the issues of insufficient labeled data and model degradation in the case of data category imbalance. Experimental results show that the method improves the AUC by 7.35% and the average class sensitivity by 1.34% compared to state-of-the-art methods, indicating its strong learning capability on unbalanced datasets with small batches.

APPLIED SCIENCES-BASEL (2023)

Article Computer Science, Artificial Intelligence

Deep learning approach for defective spot welds classification using small and class-imbalanced datasets

Wei Dai, Dayong Li, Ding Tang, Huamiao Wang, Yinghong Peng

Summary: Availability of large-scale annotated and class-balanced datasets is crucial for deep learning based computer vision tasks. This study proposes a framework for improving spot welding defects classification performance by using GAN-based data augmentation, including the generation of diverse minority-class images using balancing GAN and gradient penalty, and enhancing the classifier by adding the generated images to the training dataset.

NEUROCOMPUTING (2022)

Article Computer Science, Artificial Intelligence

Hierarchical One-Class Classifier With Within-Class Scatter-Based Autoencoders

Tianlei Wang, Jiuwen Cao, Xiaoping Lai, Q. M. Jonathan Wu

Summary: Autoencoding is an important branch of representation learning in deep neural networks, and the newly developed WSI-AE and OCC algorithm based on it were experimentally proven to be effective in comparison with state-of-the-art AEs and OCC algorithms.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2021)

Article Mathematics

Cost-Sensitive Variable Selection for Multi-Class Imbalanced Datasets Using Bayesian Networks

Dario Ramos-Lopez, Ana D. Maldonado

Summary: Multi-class classification in imbalanced datasets presents a challenging problem where traditional validation metrics may not be suitable. A cost-sensitive variable selection procedure is proposed to build a Bayesian network classifier, optimizing a specified cost function. Fine-tuning the objective validation function can improve prediction quality in imbalanced data or when considering asymmetric misclassification costs.

MATHEMATICS (2021)

Article Computer Science, Artificial Intelligence

Automated imbalanced classification via meta-learning

Nuno Moniz, Vitor Cerqueira

Summary: In this paper, an Automated Imbalanced Classification method, ATOMIC, is proposed for imbalanced classification tasks. ATOMIC provides a ranking of solutions most likely to ensure an optimal approximation to a new domain, drastically reducing associated computational complexity and energy consumption by anticipating the loss of a large set of predictive solutions in new imbalanced learning tasks. Results demonstrate that the proposed method provides a relevant approach to imbalanced learning while reducing learning and testing efforts of candidate solutions by approximately 95%.

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Article Multidisciplinary Sciences

An oversampling method for multi-class imbalanced data based on composite weights

Mingyang Deng, Yingshi Guo, Chang Wang, Fuwei Wu

Summary: The proposed oversampling method based on classification ranking and weight setting effectively addresses the oversampling problem of multi-class small samples, achieving balanced data distribution while maintaining the properties of the original samples. Compared to other algorithms, it also shows higher classification accuracy of around 90%, demonstrating practicality and generality for imbalanced multi-class samples.

PLOS ONE (2021)

Article Computer Science, Artificial Intelligence

Acute leukemia classification by ensemble particle swarm model selection

Hugo Jair Escalante, Manuel Montes-y-Gomez, Jesus A. Gonzalez, Pilar Gomez-Gil, Leopoldo Altamirano, Carlos A. Reyes, Carolina Reta, Alejandro Rosales

ARTIFICIAL INTELLIGENCE IN MEDICINE (2012)

Article Computer Science, Artificial Intelligence

The segmented and annotated IAPR TC-12 benchmark

Hugo Jair Escalante, Carlos A. Hernandez, Jesus A. Gonzalez, A. Lopez-Lopez, Manuel Montes, Eduardo F. Morales, L. Enrique Sucar, Luis Villasenor, Michael Grubinger

COMPUTER VISION AND IMAGE UNDERSTANDING (2010)

Article Astronomy & Astrophysics

Automatic stellar spectral classification via sparse representations and dictionary learning

R. Diaz-Hernandez, H. Peregrina-Barreto, L. Altamirano-Robles, J. A. Gonzalez-Bernal, A. E. Ortiz-Esquivel

EXPERIMENTAL ASTRONOMY (2014)

Article Computer Science, Artificial Intelligence

Extracting new patterns for cardiovascular disease prognosis

Luis Mena, Jesus A. Gonzalez, Gladys Maestre

EXPERT SYSTEMS (2009)

Article Computer Science, Artificial Intelligence

Multi-objective model type selection

Alejandro Rosales-Perez, Jesus A. Gonzalez, Carlos A. Coello Coello, Hugo Jair Escalante, Carlos A. Reyes-Garcia

NEUROCOMPUTING (2014)

Proceedings Paper Computer Science, Artificial Intelligence

An Evolutionary Multi-Objective Approach for Prototype Generation

Alejandro Rosales-Perez, Hugo Jair Escalante, Carlos A. Coello Coello, Jesus A. Gonzalez, Carlos A. Reyes-Garcia

2014 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) (2014)

Article Computer Science, Artificial Intelligence

Transition regions detection from satellite images based on evolutionary region growing segmentation

Jorge Morales, Jesus A. Gonzalez, Carlos A. Reyes-Garcia, Leopoldo Altamirano

INTELLIGENT DATA ANALYSIS (2014)

Proceedings Paper Computer Science, Artificial Intelligence

Genetic Selection of Fuzzy Model for Acute Leukemia Classification

Alejandro Rosales-Perez, Carlos A. Reyes-Garcia, Pilar Gomez-Gil, Jesus A. Gonzalez, Leopoldo Altamirano

ADVANCES IN ARTIFICIAL INTELLIGENCE, PT I (2011)
