4.3 Article

Classifying Imbalanced Data Sets by a Novel RE-Sample and Cost-Sensitive Stacked Generalization Method

期刊

MATHEMATICAL PROBLEMS IN ENGINEERING
卷 2018, 期 -, 页码 -

出版社

HINDAWI LTD
DOI: 10.1155/2018/5036710

关键词

-

资金

  1. Natural Science Foundation of Shanxi Province [2015011039]

向作者/读者索取更多资源

Learning with imbalanced data sets is considered as one of the key topics in machine learning community. Stacking ensemble is an efficient algorithm for normal balance data sets. However, stacking ensemble was seldom applied in imbalance data. In this paper, we proposed a novel RE-sample and Cost-Sensitive Stacked Generalization (RECSG) method based on 2-layer learning models. The first step is Level 0 model generalization including data preprocessing and base model training. The second step is Level 1 model generalization involving cost-sensitive classifier and logistic regression algorithm. In the learning phase, preprocessing techniques can be embedded in imbalance data learning methods. In the cost-sensitive algorithm, cost matrix is combined with both data characters and algorithms. In the RECSG method, ensemble algorithm is combined with imbalance data techniques. According to the experiment results obtained with 17 public imbalanced data sets, as indicated by various evaluation metrics (AUC, Geo Mean, and AGeoMean), the proposed method showed the better classification performances than other ensemble and single algorithms. The proposed method is especially more efficient when the performance of base classifier is low. All these demonstrated that the proposed method could be applied in the class imbalance problem.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.3
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据