Article; Proceedings Paper

Study of the impact of resampling methods for contrast pattern based classifiers in imbalanced databases

Journal

NEUROCOMPUTING
Volume 175, Pages 935-947

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.neucom.2015.04.120

Keywords

Supervised classification; Resampling methods; Imbalanced databases; Contrast patterns

Funding

  1. National Council of Science and Technology of Mexico [CB2008-106366, 370272]


The class imbalance problem is a challenge in supervised classification because many classifiers are sensitive to class distribution and bias their predictions towards the majority class. In imbalanced databases, contrast pattern miners usually extract a very large collection of patterns from the majority class but only a few patterns (or none) from the minority class. As a result, minority class objects have low support and may be identified as noise and discarded by the contrast pattern based classifier, which biases the results towards the majority class. In the literature, the class imbalance problem is commonly addressed by applying resampling methods. In this paper, we therefore present a study of the impact of resampling methods on the performance of contrast pattern based classifiers in class imbalance problems. Experimental results on standard imbalanced databases show statistically significant differences between the classifier's results before and after applying resampling methods. From this study, we also provide a guide, based on the class imbalance ratio, for selecting a resampling method that, together with a contrast pattern based classifier, yields good results in a class imbalance problem. (C) 2015 Elsevier B.V. All rights reserved.
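
To make the two central notions of the abstract concrete, the following minimal sketch computes a class imbalance ratio and applies one simple resampling method. It is an illustration only: the abstract does not say which resampling methods were studied, so random oversampling is an assumed stand-in, and the names `imbalance_ratio` and `random_oversample` as well as the toy data are hypothetical. The ratio is taken here as majority class size divided by minority class size, which is a common convention rather than a definition confirmed by the source.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(objects, labels, seed=0):
    """Duplicate randomly chosen objects of each smaller class until every
    class has as many objects as the largest one (assumed illustration,
    not necessarily a method evaluated in the paper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_objects, out_labels = list(objects), list(labels)
    for cls, n in counts.items():
        # Pool of original objects belonging to this class
        pool = [o for o, y in zip(objects, labels) if y == cls]
        for _ in range(target - n):
            out_objects.append(rng.choice(pool))
            out_labels.append(cls)
    return out_objects, out_labels


# Hypothetical toy database: 8 majority objects and 2 minority objects
X = [[i] for i in range(10)]
y = ["maj"] * 8 + ["min"] * 2

X_res, y_res = random_oversample(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_res))
```

After resampling, both classes have the same number of objects, so a pattern miner run on `X_res` would see minority objects with higher support than in the original data. Whether that helps a given contrast pattern based classifier is the empirical question the paper studies.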

