4.7 Article

Feature Selection for Colon Cancer Detection Using K-Means Clustering and Modified Harmony Search Algorithm

Journal

MATHEMATICS
Volume 9, Issue 5, Pages -

Publisher

MDPI
DOI: 10.3390/math9050570

Keywords

feature selection; colorectal cancer; gene expression; K-means clustering; modified harmony search

Funding

  1. National Research Foundation of Korea (NRF) - Korean government (MSIT) [2020R1A2C1A01011131]
  2. Energy Cloud R&D Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT [2019M3F2A1073164]

This paper proposes a feature selection method that distinguishes colorectal cancer patients from normal individuals by combining data preprocessing, gene selection, clustering, and the modified harmony search algorithm, resulting in a high-accuracy classification model.
This paper proposes a feature selection method that is effective in distinguishing colorectal cancer patients from normal individuals using K-means clustering and the modified harmony search algorithm. As the genetic cause of colorectal cancer originates from mutations in genes, it is important to classify the presence or absence of colorectal cancer through gene information. The proposed methodology consists of four steps. First, the original data are Z-normalized by data preprocessing. Candidate genes are then selected using the Fisher score. Next, one representative gene is selected from each cluster after candidate genes are clustered using K-means clustering. Finally, feature selection is carried out using the modified harmony search algorithm. The gene combination created by feature selection is then applied to the classification model and verified using 5-fold cross-validation. The proposed model obtained a classification accuracy of up to 94.36%. Furthermore, on comparing the proposed method with other methods, we prove that the proposed method performs well in classifying colorectal cancer. Moreover, we believe that the proposed model can be applied not only to colorectal cancer but also to other gene-related diseases.
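
The four steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the representative gene of each cluster is chosen (the sketch takes the gene with the highest Fisher score), it does not describe how the harmony search is modified (the sketch uses a plain binary harmony search), and the fitness function is a hypothetical nearest-centroid training accuracy standing in for the paper's classifier with 5-fold cross-validation. All function names, parameter values, and the synthetic data are illustrative.

```python
import numpy as np

def z_normalize(X):
    """Step 1: scale each gene (column) to zero mean and unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant genes stay at zero instead of dividing by 0
    return (X - X.mean(axis=0)) / std

def fisher_score(X, y):
    """Step 2: between-class over within-class variance, one score per gene."""
    mean_all = X.mean(axis=0)
    num, den = np.zeros(X.shape[1]), np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mean_all) ** 2
        den += len(Xc) * Xc.var(axis=0)
    return num / np.maximum(den, 1e-12)

def kmeans_labels(P, k, rng, n_iter=50):
    """Plain Lloyd's K-means over the rows of P; returns one label per row."""
    centers = P[rng.choice(len(P), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = P[labels == j].mean(axis=0)
    return labels

def representatives(X, y, n_candidates, k, rng):
    """Step 3: cluster the top-scoring genes and keep one gene per cluster.
    Assumption: the representative is the cluster's highest-scoring gene."""
    scores = fisher_score(X, y)
    cand = np.argsort(scores)[::-1][:n_candidates]
    labels = kmeans_labels(X[:, cand].T, k, rng)
    reps = []
    for j in range(k):
        members = cand[labels == j]
        if len(members):
            reps.append(members[np.argmax(scores[members])])
    return np.array(sorted(reps))

def centroid_accuracy(X, y, mask):
    """Hypothetical fitness: nearest-centroid training accuracy on masked genes."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0
    Xs, classes = X[:, mask], np.unique(y)
    cents = np.stack([Xs[y == c].mean(axis=0) for c in classes])
    dist = ((Xs[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dist.argmin(axis=1)] == y).mean())

def harmony_search(fitness, n_genes, rng, hms=10, hmcr=0.9, par=0.3, n_iter=200):
    """Step 4: plain binary harmony search (not the paper's modified variant)."""
    memory = rng.integers(0, 2, size=(hms, n_genes))
    scores = np.array([fitness(h) for h in memory])
    for _ in range(n_iter):
        new = np.empty(n_genes, dtype=int)
        for g in range(n_genes):
            if rng.random() < hmcr:          # draw this bit from harmony memory
                new[g] = memory[rng.integers(hms), g]
                if rng.random() < par:       # pitch adjustment as a bit flip
                    new[g] = 1 - new[g]
            else:                            # otherwise pick the bit at random
                new[g] = rng.integers(0, 2)
        s = fitness(new)
        worst = scores.argmin()
        if s > scores[worst]:                # replace the worst stored harmony
            memory[worst], scores[worst] = new, s
    best = scores.argmax()
    return memory[best].astype(bool), scores[best]

# Synthetic data for illustration only: 60 samples, 200 genes, 5 informative.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 200))
X[y == 1, :5] += 2.0
Xz = z_normalize(X)
reps = representatives(Xz, y, n_candidates=40, k=8, rng=rng)
mask, acc = harmony_search(lambda m: centroid_accuracy(Xz[:, reps], y, m), len(reps), rng)
selected = reps[mask]  # column indices of the finally selected genes
```

Clustering before the search shrinks the search space from all candidate genes to one gene per cluster, so the harmony search only has to decide which of a handful of weakly redundant genes to keep. In a faithful reproduction, the fitness would be replaced by cross-validated accuracy of the chosen classifier.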
