4.3 Review

Systematic literature review of preprocessing techniques for imbalanced data

Journal

IET SOFTWARE
Volume 13, Issue 6, Pages 479-496

Publisher

WILEY
DOI: 10.1049/iet-sen.2018.5193

Keywords

learning (artificial intelligence); reviews; systematic literature review method; systematic literature review guidelines; data sets; data quality; machine learning; imbalanced data preprocessing techniques; defect reduction; quality assessment criteria

Funding

  1. Ministry of Education under University of Malaya High Impact Research grant [UM.C/625/1/HIR/MOHE/FCSIT/13]
  2. Ministry of Education under Fundamental Research Grant Scheme (FRGS) [FP001-2016]

Data preprocessing remains an important step in machine learning studies, because proper preprocessing of imbalanced data can enable researchers to reduce defects as much as possible, which in turn may lead to the elimination of defects in existing data sets. Despite the remarkable achievements in machine learning studies, systematic literature reviews of imbalanced data preprocessing techniques are lacking. In this study, the authors assess the existing literature to identify the key issues related to data quality and handling, and to provide a convenient collection of the techniques used to address these issues during data preprocessing. They applied a systematic literature review method involving a manual search to select articles published from January 2010 to September 2018. The quality of the existing studies was assessed using a set of quality assessment criteria. Of the 118 relevant studies found, only 2% were identified as having followed systematic literature review guidelines. This study therefore calls for more systematic literature reviews on data preprocessing to improve the quality of the data used in machine learning studies.

Reviews

Primary rating

4.3
Not enough ratings

Secondary ratings

Novelty
-
Significance
-
Scientific rigor
-