Ensemble-based hybrid probabilistic sampling for imbalanced data learning in lung nodule CAD

COMPUTERIZED MEDICAL IMAGING AND GRAPHICS (2014)

Journal

COMPUTERIZED MEDICAL IMAGING AND GRAPHICS

Volume 38, Issue 3, Pages 137-150

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.compmedimag.2013.12.003

Keywords

Lung nodule detection; False positive reduction; Imbalanced data learning; Ensemble classifier; Re-sampling; Random subspace method

Categories

Engineering, Biomedical; Radiology, Nuclear Medicine & Medical Imaging

Funding

Alberta Innovates Centre for Machine Learning
National Natural Science Foundation of China [61001047]
China Scholarship Council

Abstract

Classification plays a critical role in false positive reduction (FPR) in lung nodule computer-aided detection (CAD). The difficulty of FPR lies in the variation in nodule appearance and the imbalanced distribution between the nodule and non-nodule classes. Moreover, inherent complex structures in the data distribution, such as within-class imbalance and high dimensionality, further degrade classification performance. To address these challenges, we propose hybrid probabilistic sampling combined with a diverse random subspace ensemble. Experimental results demonstrate the effectiveness of the proposed method in terms of geometric mean (G-mean) and area under the ROC curve (AUC) compared with commonly used methods. © 2013 Elsevier Ltd. All rights reserved.
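
The abstract names two ingredients, re-sampling and a random subspace ensemble, and evaluates with G-mean. The sketch below is only an illustration of those general ideas, not the authors' method: plain random balanced re-sampling stands in for the paper's hybrid probabilistic sampling, a nearest-centroid rule stands in for the base classifier, and all function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (label 1 = nodule)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == 1
    neg = ~pos
    sensitivity = np.mean(y_pred[pos] == 1) if pos.any() else 0.0
    specificity = np.mean(y_pred[neg] == 0) if neg.any() else 0.0
    return float(np.sqrt(sensitivity * specificity))


def balanced_resample(X, y, rng):
    """ASSUMPTION: plain random re-sampling with replacement, used here as a
    stand-in for the paper's hybrid probabilistic sampling."""
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    n = (len(pos_idx) + len(neg_idx)) // 2  # same count drawn from each class
    idx = np.concatenate([
        rng.choice(pos_idx, size=n, replace=True),
        rng.choice(neg_idx, size=n, replace=True),
    ])
    return X[idx], y[idx]


def fit_subspace_ensemble(X, y, n_members=15, subspace_frac=0.5, seed=0):
    """Train each member on a random feature subset of a re-balanced sample."""
    rng = np.random.default_rng(seed)
    k = max(1, int(subspace_frac * X.shape[1]))
    members = []
    for _ in range(n_members):
        feats = rng.choice(X.shape[1], size=k, replace=False)
        Xb, yb = balanced_resample(X[:, feats], y, rng)
        # ASSUMPTION: nearest-centroid base learner, chosen only for brevity.
        centroids = np.stack([Xb[yb == c].mean(axis=0) for c in (0, 1)])
        members.append((feats, centroids))
    return members


def predict_subspace_ensemble(members, X):
    """Majority vote over members; each member predicts the nearest centroid."""
    votes = np.zeros(len(X))
    for feats, centroids in members:
        diffs = X[:, feats][:, None, :] - centroids[None, :, :]
        votes += np.linalg.norm(diffs, axis=2).argmin(axis=1)
    return (votes / len(members) >= 0.5).astype(int)


# Synthetic 20:1 imbalanced data (hypothetical, not lung nodule features).
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(0.0, 1.0, size=(500, 10)),  # majority: non-nodule
    data_rng.normal(1.5, 1.0, size=(25, 10)),   # minority: nodule
])
y = np.array([0] * 500 + [1] * 25)

members = fit_subspace_ensemble(X, y)
pred = predict_subspace_ensemble(members, X)
print("G-mean on the synthetic set:", g_mean(y, pred))
```

G-mean is used rather than accuracy because a classifier that labels every candidate as non-nodule would score high accuracy on such data yet have zero sensitivity, which drives G-mean to zero.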

Recommended

Article Computer Science, Artificial Intelligence

Re-sampling of multi-class imbalanced data using belief function theory and ensemble learning

Fares Grina, Zied Elouedi, Eric Lefevre

Summary: Imbalanced classification refers to problems where there are significantly more instances for some classes than others. Traditional classifiers tend to be biased towards the majority class, so special attention is needed. This paper proposes a re-sampling approach based on belief function theory and ensemble learning to address class imbalance in the multi-class setting. The approach assigns soft evidential labels to each instance, selects ambiguous majority instances for undersampling, and oversamples minority objects through the generation of synthetic examples in borderline regions. It is incorporated into an evidential classifier-independent fusion-based ensemble, and comparison studies show its efficiency according to G-Mean and F1-score measures.

INTERNATIONAL JOURNAL OF APPROXIMATE REASONING (2023)

Article Computer Science, Artificial Intelligence

ECC++: An algorithm family based on ensemble of classifier chains for classifying imbalanced multi-label data

Jicong Duan, Yan Gu, Hualong Yu, Xibei Yang, Shang Gao

Summary: Multi-label learning has a wide range of real-world applications, yet class imbalance in multi-label data has received comparatively little attention. This study proposes the ECC++ algorithm family, which combines the ensemble classifier chain algorithm with binary-class imbalance learning techniques to tackle the challenges of class imbalance and label correlations. Experimental results demonstrate the effectiveness and superiority of ECC++ over existing class imbalance multi-label learning algorithms.

EXPERT SYSTEMS WITH APPLICATIONS (2024)

Article Computer Science, Artificial Intelligence

Deterministic Sampling Classifier with weighted Bagging for drifted imbalanced data stream classification

Jakub Klikowski, Michal Wozniak

Summary: Streaming data classification is a critical task that must deal with concept drift and imbalanced data. This paper proposes a novel algorithm that uses data preprocessing and a weighted bagging technique to address these challenges, and experimental results demonstrate its effectiveness in various scenarios.

APPLIED SOFT COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Hybrid-ensemble-based interpretable TSK fuzzy classifier for imbalanced data

Zekang Bian, Jin Zhang, Yusuke Nojima, Fu-lai Chung, Shitong Wang

Summary: This study proposes a novel hybrid-ensemble-based imbalanced interpretable TSK fuzzy classifier (HI-TSK-FC), motivated by its distinguished nonlinear mapping capability and interpretability, to achieve enhanced generalization and better interpretability. The HI-TSK-FC integrates an imbalanced global linear regression sub-classifier (IGLRc) and several imbalanced TSK fuzzy sub-classifiers (I-TSK-FCs). Its training method, called imbalanced residual sketch learning (IRSL), is devised to share the virtues of both deep and wide learning.

INFORMATION FUSION (2023)

Article Computer Science, Information Systems

Text length considered adaptive bagging ensemble learning algorithm for text classification

Youwei Wang, Jiangchun Liu, Lizhou Feng

Summary: Ensemble learning is widely used in text classification to construct strong classifiers. A text-length-aware adaptive Bagging ensemble learning algorithm (TC_Bagging) is proposed to improve text classification accuracy. It compares different deep learning methods on long and short texts, constructs optimal base classifier groups, and uses a random sampling method based on adaptive threshold groups to train on text sample subsets of different lengths. The algorithm combines a text vector generation algorithm based on smooth inverse frequency with the traditional weighted voting classifier ensemble method to achieve better classification performance than baseline methods.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Computer Science, Information Systems

Evidential reasoning based ensemble classifier for uncertain imbalanced data

Chao Fu, Qianshan Zhan, Weiyong Liu

Summary: This study introduces an evidential reasoning-based ensemble classifier to handle uncertain and imbalanced data, filling a research gap in the field. By developing an oversampling technique and constructing ER-based classifiers, data uncertainty is effectively managed. Experimental comparisons with real and UCI datasets demonstrate the high performance of the proposed method.

INFORMATION SCIENCES (2021)

Article Computer Science, Artificial Intelligence

Sparse projection infinite selection ensemble for imbalanced classification

Zhihan Ning, Zhixing Jiang, David Zhang

Summary: Imbalanced datasets pose frequent and challenging problems in real-world applications, where classification models tend to be biased towards the majority class. This paper proposes a novel framework called SPISE, which addresses the imbalanced learning problem by iteratively resampling balanced subsets and combining classifiers trained on these subsets. It takes into account the diversity of classifier ensembles and the similarity between subsets and the whole dataset.

KNOWLEDGE-BASED SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Equalization ensemble for large scale highly imbalanced data classification

Jinjun Ren, Yuping Wang, Mingqian Mao, Yiu-ming Cheung

Summary: The class-imbalance problem is common in various research fields. To tackle this problem, an equalization ensemble method (EASE) is proposed, which utilizes equalization under-sampling and weighted integration schemes. Experimental results demonstrate that EASE outperforms state-of-the-art methods on imbalanced data sets with different scales.

KNOWLEDGE-BASED SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

Entropy-based hybrid sampling ensemble learning for imbalanced data

Li Dongdong, Chi Ziqiu, Wang Bolu, Wang Zhe, Yang Hai, Du Wenli

Summary: The paper proposes a hybrid sampling method based on information entropy to address the issues of important sample loss and overlapping in dealing with imbalanced data. The method retains all data in the training process, handles each data view with individual basic classifiers, and demonstrates effectiveness on real-world datasets through ensemble learning.

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS (2021)

Article Computer Science, Information Systems

Efficient random subspace decision forests with a simple probability dimensionality setting scheme

Quan Wang, Fei Wang, Zhongheng Li, Peilin Jiang, Fuji Ren, Feiping Nie

Summary: This paper presents a novel framework called Efficient Random Subspace decision forest (ERS), which uses the HRDUVD method to determine the dimensionality of the random subspace. The ERS framework achieves effective and efficient decision forests in high dimensional cases.

INFORMATION SCIENCES (2023)

Article Computer Science, Information Systems

KDE-Based Ensemble Learning for Imbalanced Data

Firuz Kamalov, Sherif Moussa, Jorge Avante Reyes

Summary: This paper proposes a novel ensemble classification method that deals with imbalanced data by training each tree in the ensemble using uniquely generated synthetically balanced data. Data balancing is achieved through kernel density estimation, resulting in a lower variance of the model estimator. The proposed classifier significantly outperforms benchmark methods in various datasets.

ELECTRONICS (2022)

Article Computer Science, Information Systems

FastForest: Increasing random forest processing speed while maintaining accuracy

Darren Yates, Md Zahidul Islam

Summary: The FastForest algorithm, with its three optimizing components, achieves faster processing speed on hardware-constrained devices while maintaining high accuracy, suitable for both PC and smartphone platforms. Empirical testing shows excellent performance against other ensemble classifiers, surpassing them in various tests.

INFORMATION SCIENCES (2021)

Article Computer Science, Artificial Intelligence

Neural random subspace

Yun-Hao Cao, Jianxin Wu, Hanchen Wang, Joan Lasenby

Summary: The paper introduces a novel deep learning based random subspace method, NRS, which outperforms traditional forest methods with better representation learning and higher accuracy, demonstrating superior performance on machine learning datasets and achieving improvements in image and point cloud recognition tasks.

PATTERN RECOGNITION (2021)

Article Biology

Cross attention guided multi-scale feature fusion for false-positive reduction in pulmonary nodule detection

Zhongxuan Gu, Yueyang Li, Haichi Luo, Caidi Zhang, Hongqun Du

Summary: In this paper, a novel method for reducing false positives in pulmonary nodule detection is proposed. It achieves better feature extraction through multi-scale feature fusion and spatial pyramid pooling, and improves the model's generalization performance using a weighted fusion method. Extensive experiments demonstrate the effectiveness of the proposed method.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Multidisciplinary Sciences

Dynamic learning for imbalanced data in learning chest X-ray and CT images

Saeed Iqbal, Adnan N. Qureshi, Jianqiang Li, Imran Arshad Choudhry, Tariq Mahmood

Summary: Massive annotated datasets are crucial for deep learning networks. Limited annotated datasets complicate the research of new topics, like viral epidemics. Furthermore, the unbalanced datasets in this scenario lack significant findings regarding the novel illness. Our proposed technique uses deep learning to train and evaluate chest X-ray and CT images, allowing for the detection of lung disease signs. By utilizing class balancing algorithms and imbalance-based sample analyzers, minority categories can be identified in the classification process. The technique achieves high accuracy and generalization, making it a potential tool for pathologists.

HELIYON (2023)
