4.7 Article

Ordering-based pruning for improving the performance of ensembles of classifiers in the framework of imbalanced datasets

Journal

INFORMATION SCIENCES
Volume 354, Pages 178-196

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2016.02.056

Keywords

Imbalanced datasets; Tree-based ensembles; Ordering-based pruning; Bagging; Boosting

Funding

  1. Spanish Ministry of Science and Technology [TIN2011-28488, TIN2013-40765-P, TIN2014-57251-P]
  2. Andalusian Research Plans [P11-TIC-7765, P10-TIC-6858]
  3. University of Jaen [UJA2014/06/15]
  4. Caja Rural Provincial de Jaen [UJA2014/06/15]

Abstract

Classification with imbalanced datasets has gained notable significance in recent years, because many real-world problems exhibit highly skewed class distributions that degrade the overall performance of the learning system. A great number of approaches have been developed to address this problem, traditionally from three perspectives: data-level treatment, adaptation of algorithms, and cost-sensitive learning. Ensemble-based classification models extend these solutions: they combine a pool of classifiers and can, in turn, integrate any of the former proposals. Several studies in the specialized literature have shown the quality and performance of this type of methodology over baseline solutions. The goal of this work is to improve the capabilities of tree-based ensemble solutions specifically designed for imbalanced classification, focusing on the bagging- and boosting-based ensembles that behave best in this scenario. To this end, this paper proposes several new metrics for ordering-based pruning that are properly adapted to the skewed class distribution. Our experimental study shows two main results: on the one hand, the new metrics make pruning a very successful approach in this scenario; on the other hand, the Under-Bagging model excels, achieving the highest gain from pruning, since the random undersampled sets that best complement each other can be selected. Accordingly, this scheme is capable of outperforming previous state-of-the-art ensemble models. (C) 2016 Elsevier Inc. All rights reserved.
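
The core idea described in the abstract, ordering-based pruning of an Under-Bagging pool with an imbalance-aware metric, can be illustrated with a short sketch. This is a minimal, hypothetical reconstruction, not the authors' implementation: it assumes scikit-learn decision trees, uses the geometric mean of per-class recalls (gmean below) as a stand-in for the specific ordering metrics proposed in the paper, evaluates candidates on a held-out validation split, and the helper names (undersample, gmean) are illustrative only.

    # Sketch: Under-Bagging pool + greedy ordering-based pruning (assumptions noted above).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import recall_score

    rng = np.random.default_rng(0)

    # Imbalanced toy problem (roughly 9:1 majority/minority ratio).
    X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

    def undersample(X, y):
        """Random undersampling of the majority class to obtain a balanced subset."""
        minority = np.flatnonzero(y == 1)
        majority = np.flatnonzero(y == 0)
        keep = rng.choice(majority, size=minority.size, replace=False)
        idx = np.concatenate([minority, keep])
        return X[idx], y[idx]

    # 1) Build the Under-Bagging pool: each tree is trained on a different balanced sample.
    pool = []
    for _ in range(40):
        Xs, ys = undersample(X_tr, y_tr)
        pool.append(DecisionTreeClassifier().fit(Xs, ys))

    def gmean(y_true, votes):
        """Geometric mean of per-class recalls for the majority-vote prediction."""
        y_pred = (votes >= 0.5).astype(int)
        r1 = recall_score(y_true, y_pred, pos_label=1)
        r0 = recall_score(y_true, y_pred, pos_label=0)
        return np.sqrt(r0 * r1)

    # 2) Ordering-based pruning: greedily append the classifier whose inclusion
    #    maximises the ensemble-level metric, then keep the best-scoring prefix.
    preds = np.array([clf.predict(X_val) for clf in pool])   # shape: (pool_size, n_val)
    selected, remaining = [], list(range(len(pool)))
    best_subset, best_score = [], -np.inf
    while remaining:
        scores = [gmean(y_val, preds[selected + [j]].mean(axis=0)) for j in remaining]
        j = remaining.pop(int(np.argmax(scores)))
        selected.append(j)
        if max(scores) > best_score:
            best_score, best_subset = max(scores), list(selected)

    pruned_ensemble = [pool[i] for i in best_subset]
    print(f"kept {len(pruned_ensemble)}/{len(pool)} trees, validation G-mean = {best_score:.3f}")

The greedy ordering re-ranks the pool so that the first k classifiers, evaluated jointly, maximise the chosen metric; the pruned ensemble keeps the best-scoring prefix, which is one way the undersampled sets that best complement each other can be selected, as the abstract describes.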
