4.7 Article

Adaptive sampling using self-paced learning for imbalanced cancer data pre-diagnosis

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 152, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.113334

Keywords

Imbalanced classification; Adaptive sampling; Cancer pre-diagnosis; Elastic-net regularization

Funding

  1. National Natural Science Foundation of China [61703416]
  2. Natural Science Foundation of Hunan Province of China [2018JJ3614]
  3. Postgraduate Research Innovation Project from Hunan Provincial Department of Education [CX20190040]

Abstract

The early diagnosis of cancer is an indispensable part of cancer research. It has motivated the development of many new machine learning approaches that assist disease identification based on gene expression data. However, the rare occurrence of malignant tumors poses a challenge because it creates a risk of over-fitting during model training. Typically, various sampling methods (e.g., random oversampling and undersampling) are used to address this challenge by providing a balanced data distribution. However, these methods might discard potentially useful samples. In this paper, we propose an imbalanced sampling approach via self-paced learning (ISPL) that effectively selects high-quality samples to improve robustness. The experimental results show that the proposed ISPL method increases classification accuracy by approximately 16% compared with the average performance obtained by other sampling methods. In addition, the new method successfully selects some important genes for further investigation. (C) 2020 Elsevier Ltd. All rights reserved.
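
The contrast the abstract draws between random undersampling and loss-guided sample selection can be illustrated with a small sketch. This is not the authors' ISPL algorithm, which the abstract does not specify. It only shows the generic hard-threshold rule used in self-paced learning, where a sample is kept when its current loss falls below a pace parameter. The function names, the `lam` parameter, the toy labels and losses, and the choice to always keep minority samples (label 1) are assumptions made for illustration.

```python
import random


def random_undersample(labels, seed=0):
    """Randomly drop majority-class samples until both classes are the same size.

    Returns the sorted indices of the samples that are kept.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)


def self_paced_select(losses, labels, lam):
    """Hard self-paced selection: keep sample i only if losses[i] < lam.

    Assumption for this sketch: label 1 is the minority class and is always
    kept, so only majority samples are filtered by their loss.
    """
    return [
        i
        for i, (loss, y) in enumerate(zip(losses, labels))
        if y == 1 or loss < lam
    ]


# Toy data (hypothetical): 2 minority samples, 6 majority samples.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical per-sample losses from some current classifier.
losses = [0.9, 0.4, 0.1, 0.2, 1.5, 0.3, 2.0, 0.05]

# Random undersampling ignores the losses when choosing what to discard.
print(random_undersample(labels))

# A small pace parameter keeps only the easiest majority samples.
print(self_paced_select(losses, labels, lam=0.25))  # [0, 1, 2, 3, 7]

# Raising the pace parameter admits harder samples in later rounds.
print(self_paced_select(losses, labels, lam=1.0))   # [0, 1, 2, 3, 5, 7]
```

In a full self-paced procedure the classifier would be retrained on the selected samples, the losses recomputed, and `lam` increased over several rounds. Those steps are omitted here.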
