4.3 Article

DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest

Journal

ONCOTARGET
Volume 9, Issue 2, Pages 1944-1956

Publisher

IMPACT JOURNALS LLC
DOI: 10.18632/oncotarget.23099

Keywords

DNase I hypersensitive site; feature selection; machine learning; random forest; support vector machine

Funding

  1. Basic Science Research Program through the National Research Foundation (NRF) of Korea - Ministry of Education, Science and Technology [2015R1D1A1A09060192]
  2. Priority Research Centers Program through the National Research Foundation of Korea (NRF) - Ministry of Education, Science and Technology [2009-0093826]
  3. Brain Research Program through the National Research Foundation of Korea (NRF) - Ministry of Science, ICT & Future Planning [2016M3C7A1904392]

DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Identifying DHSs in uncharacterized DNA sequences is therefore crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven too expensive for genome-wide application, so computational methods for DHS prediction are needed. In this study, we propose a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified with a random forest algorithm from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community and is freely available at: http://www.thegleelab.org/DHSpred.html.
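
Two ingredients named in the abstract, nucleotide composition features and the Matthews correlation coefficient (MCC) and accuracy metrics, can be illustrated with a minimal Python sketch. This is not the DHSpred implementation: the function names (`kmer_composition`, `accuracy`, `mcc`) and the choice to skip windows containing non-ACGT symbols are assumptions made for illustration, and the physicochemical-property features, the random forest feature selection, and the SVM itself are not shown.

```python
from itertools import product
from math import sqrt


def kmer_composition(seq, k=2):
    """Return normalized frequencies of all 4**k DNA k-mers, in a fixed order.

    Hypothetical helper: windows containing symbols other than A, C, G, T
    (for example N) are skipped, which is an assumption of this sketch.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(kmers)
    return [counts[w] / windows for w in kmers]


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions, from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy sequence and toy counts, not data from the paper.
    features = kmer_composition("ACGTACGTTAGC", k=2)
    print(len(features), "dinucleotide features")
    print("accuracy:", accuracy(tp=40, tn=40, fp=10, fn=10))
    print("MCC:", mcc(tp=40, tn=40, fp=10, fn=10))
```

Concatenating the vectors for k = 1, 2, and 3 would give one simple composition-based encoding of a sequence. MCC is often reported alongside accuracy because it stays informative when the positive and negative classes are unbalanced.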
