Article
Computer Science, Artificial Intelligence
Fares Grina, Zied Elouedi, Eric Lefevre
Summary: Imbalanced classification refers to problems where some classes have significantly more instances than others. Traditional classifiers tend to be biased towards the majority class, so special attention is needed. This paper proposes a re-sampling approach based on belief function theory and ensemble learning to address class imbalance in the multi-class setting. The approach assigns soft evidential labels to each instance, selects ambiguous majority instances for undersampling, and oversamples minority objects by generating synthetic examples in borderline regions. It is incorporated into an evidential, classifier-independent, fusion-based ensemble, and comparison studies show its efficiency according to the G-Mean and F1-score measures.
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
(2023)
Article
Computer Science, Artificial Intelligence
Jicong Duan, Yan Gu, Hualong Yu, Xibei Yang, Shang Gao
Summary: Multi-label learning has a wide range of real-world applications, yet class imbalance in multi-label data has received comparatively little attention. This study proposes the ECC++ algorithm family, which combines the ensemble classifier chain algorithm with binary-class imbalance learning techniques to tackle the challenges of class imbalance and label correlations. Experimental results demonstrate the effectiveness and superiority of ECC++ over existing class imbalance multi-label learning algorithms.
EXPERT SYSTEMS WITH APPLICATIONS
(2024)
Article
Computer Science, Artificial Intelligence
Jakub Klikowski, Michal Wozniak
Summary: Streaming data classification is a critical task that must deal with concept drift and imbalanced data. This paper proposes a novel algorithm that uses data preprocessing and a weighted bagging technique to address these challenges, and experimental results demonstrate its effectiveness in various scenarios.
APPLIED SOFT COMPUTING
(2022)
Article
Computer Science, Artificial Intelligence
Zekang Bian, Jin Zhang, Yusuke Nojima, Fu-lai Chung, Shitong Wang
Summary: Building on the distinguished nonlinear mapping capability and interpretability of TSK fuzzy classifiers, this study proposes a novel hybrid-ensemble-based imbalanced interpretable TSK fuzzy classifier (HI-TSK-FC) to achieve enhanced generalization and better interpretability. The HI-TSK-FC integrates an imbalanced global linear regression sub-classifier (IGLRc) and several imbalanced TSK fuzzy sub-classifiers (I-TSK-FCs). Its training method, called imbalanced residual sketch learning (IRSL), is devised to share the virtues of both deep and wide learning.
INFORMATION FUSION
(2023)
Article
Computer Science, Information Systems
Youwei Wang, Jiangchun Liu, Lizhou Feng
Summary: Ensemble learning is widely used in the field of text classification to construct strong classifiers. A text-length-considered adaptive Bagging ensemble learning algorithm (TC_Bagging) is proposed to improve text classification accuracy. It compares different deep learning methods in processing long and short texts, constructs optimal base classifier groups, and uses a random sampling method based on adaptive threshold groups to train on text sample subsets of different lengths. The algorithm combines a text vector generation algorithm based on smooth inverse frequency with the traditional weighted voting classifier ensemble method to achieve better classification performance than baseline methods.
MULTIMEDIA TOOLS AND APPLICATIONS
(2023)
Article
Computer Science, Information Systems
Chao Fu, Qianshan Zhan, Weiyong Liu
Summary: This study introduces an evidential reasoning-based ensemble classifier to handle uncertain and imbalanced data, filling a research gap in the field. By developing an oversampling technique and constructing ER-based classifiers, data uncertainty is effectively managed. Experimental comparisons with real and UCI datasets demonstrate the high performance of the proposed method.
INFORMATION SCIENCES
(2021)
Article
Computer Science, Artificial Intelligence
Zhihan Ning, Zhixing Jiang, David Zhang
Summary: Imbalanced datasets pose frequent and challenging problems in real-world applications, where classification models tend to be biased towards the majority class. This paper proposes a novel framework called SPISE, which addresses the imbalanced learning problem by iteratively resampling balanced subsets and combining classifiers trained on these subsets. It takes into account the diversity of classifier ensembles and the similarity between subsets and the whole dataset.
KNOWLEDGE-BASED SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Jinjun Ren, Yuping Wang, Mingqian Mao, Yiu-ming Cheung
Summary: The class-imbalance problem is common in various research fields. To tackle this problem, an equalization ensemble method (EASE) is proposed, which utilizes equalization under-sampling and weighted integration schemes. Experimental results demonstrate that EASE outperforms state-of-the-art methods on imbalanced data sets with different scales.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Li Dongdong, Chi Ziqiu, Wang Bolu, Wang Zhe, Yang Hai, Du Wenli
Summary: The paper proposes a hybrid sampling method based on information entropy to address the loss of important samples and the class overlap that arise when dealing with imbalanced data. The method retains all data in the training process, handles each data view with an individual base classifier, and demonstrates its effectiveness on real-world datasets through ensemble learning.
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
(2021)
Article
Computer Science, Information Systems
Quan Wang, Fei Wang, Zhongheng Li, Peilin Jiang, Fuji Ren, Feiping Nie
Summary: This paper presents a novel framework called Efficient Random Subspace decision forest (ERS), which uses the HRDUVD method to determine the dimensionality of the random subspace. The ERS framework yields effective and efficient decision forests in high-dimensional cases.
INFORMATION SCIENCES
(2023)
Article
Computer Science, Information Systems
Firuz Kamalov, Sherif Moussa, Jorge Avante Reyes
Summary: This paper proposes a novel ensemble classification method that deals with imbalanced data by training each tree in the ensemble using uniquely generated synthetically balanced data. Data balancing is achieved through kernel density estimation, resulting in a lower variance of the model estimator. The proposed classifier significantly outperforms benchmark methods in various datasets.
Article
Computer Science, Information Systems
Darren Yates, Md Zahidul Islam
Summary: The FastForest algorithm, with its three optimizing components, achieves faster processing speed on hardware-constrained devices while maintaining high accuracy, making it suitable for both PC and smartphone platforms. Empirical testing shows excellent performance against other ensemble classifiers, which it surpasses in various tests.
INFORMATION SCIENCES
(2021)
Article
Computer Science, Artificial Intelligence
Yun-Hao Cao, Jianxin Wu, Hanchen Wang, Joan Lasenby
Summary: The paper introduces a novel deep-learning-based random subspace method, NRS, which outperforms traditional forest methods through better representation learning and higher accuracy. It demonstrates superior performance on machine learning datasets and achieves improvements in image and point cloud recognition tasks.
PATTERN RECOGNITION
(2021)
Article
Biology
Zhongxuan Gu, Yueyang Li, Haichi Luo, Caidi Zhang, Hongqun Du
Summary: In this paper, a novel method for reducing false positives in pulmonary nodule detection is proposed. It achieves better feature extraction through multi-scale feature fusion and spatial pyramid pooling, and improves the model's generalization performance using a weighted fusion method. Extensive experiments demonstrate the effectiveness of the proposed method.
COMPUTERS IN BIOLOGY AND MEDICINE
(2022)
Article
Multidisciplinary Sciences
Saeed Iqbal, Adnan N. Qureshi, Jianqiang Li, Imran Arshad Choudhry, Tariq Mahmood
Summary: Massive annotated datasets are crucial for deep learning networks. Limited annotated datasets complicate research on new topics, such as viral epidemics. Furthermore, the unbalanced datasets available in such scenarios contain few significant findings regarding the novel illness. The proposed technique uses deep learning to train on and evaluate chest X-ray and CT images, allowing for the detection of lung disease signs. By utilizing class balancing algorithms and imbalance-based sample analyzers, minority categories can be identified in the classification process. The technique achieves high accuracy and generalization, making it a potential tool for pathologists.