4.7 Article

An ensemble imbalanced classification method based on model dynamic selection driven by data partition hybrid sampling

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 160

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.113660

Keywords

Imbalanced classification; Ensemble learning; Data partition hybrid sampling; Model dynamic selection

Funding

  1. Science and Technology Project of SGCC [SGNR0000KJJS1802828]

In many real-world applications, classification problems suffer from class imbalance. Methods for imbalanced data that rely only on data processing or only on algorithm improvement cannot achieve satisfactory classification performance on the minority class. This paper proposes an ensemble classification method for imbalanced data based on model dynamic selection driven by data partition hybrid sampling. The method includes two core components: the generation of balanced datasets and the dynamic selection of classification models. At the data level, a data partition hybrid sampling (DPHS) method is proposed to balance datasets. In particular, the data space is divided into four regions according to the majority class proportion in minority class neighborhoods. Then we present a boundary minority class weighted over-sampling (BMW-SMOTE) method, where the weight of each minority class instance is the ratio between the majority class proportion in the neighborhood of that instance and the sum of all these proportions. The number of synthetic instances is determined by the weight. At the algorithm level, we present a model dynamic selection (MDS) strategy. Three ensemble learning models are built. Among them, the local regions reinforce and weaken model is trained on the balanced dataset obtained by the proposed DPHS method, which strengthens the identification of test instances on the boundary and appropriately weakens the dense distribution of the majority class. The model for each test instance is selected adaptively according to the imbalance degree of its neighbors. The experimental results show that the proposed method outperforms typical imbalanced classification methods in terms of F-measure and G-mean. (c) 2020 Elsevier Ltd. All rights reserved.
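
The BMW-SMOTE weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the brute-force Euclidean k-nearest-neighbor search, the choice of k, and the rounding of weights into integer counts are all assumptions made for illustration; the abstract only states that each weight is a neighborhood majority proportion divided by the sum of those proportions, and that the weight determines the number of synthetic instances.

```python
import numpy as np

def majority_proportions(X, y, minority_label, k=5):
    """Fraction of majority-class points among the k nearest neighbors
    of each minority instance (hypothetical helper, brute-force search)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    # Euclidean distances from every minority instance to every instance.
    dist = np.linalg.norm(X[min_idx, None, :] - X[None, :, :], axis=2)
    # An instance is not its own neighbor.
    dist[np.arange(len(min_idx)), min_idx] = np.inf
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return (y[neighbors] != minority_label).mean(axis=1)

def synthetic_counts(proportions, n_synthetic):
    """Weight = proportion / sum of proportions, as stated in the abstract.
    Turning weights into counts by rounding is an assumption of this sketch."""
    proportions = np.asarray(proportions, dtype=float)
    total = proportions.sum()
    if total == 0:
        # No minority instance has majority neighbors; nothing to weight.
        return np.zeros(len(proportions), dtype=int)
    weights = proportions / total
    return np.rint(weights * n_synthetic).astype(int)

# Toy 1-D data: five majority points (label 0), four minority points (label 1).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [3.4], [4.5], [5.1], [5.5]])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])
props = majority_proportions(X, y, minority_label=1, k=2)
print(props)                       # majority proportion per minority instance
print(synthetic_counts(props, 6))  # synthetic instances allotted to each
```

In this toy example, the minority instances closest to the majority cluster receive all of the synthetic instances, which matches the stated aim of reinforcing the class boundary.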
