4.5 Article

Prediction of interactiveness of proteins and nucleic acids based on feature selections

Journal

MOLECULAR DIVERSITY
Volume 14, Issue 4, Pages 627-633

Publisher

SPRINGER
DOI: 10.1007/s11030-009-9198-9

Keywords

Nucleic acid and protein interaction; Feature selection; Prediction; mRMR; Forward feature wrapper

Funding

  1. National Basic Research Program of China [2004CB518603]
  2. Shanghai Commission for Science and Technology [KSCX2-YW-R-112]
  3. Shanghai Leading Academic Discipline Project [J50101]
  4. Excellent Young Teachers Program of Shanghai [B.37-0101-07-716]

Identifying which proteins can interact with nucleic acids is important for protein annotation, since interactions between nucleic acids and proteins are involved in numerous cellular processes such as replication, transcription, splicing, and DNA repair. This research aims to identify proteins that can interact with DNA, RNA, and rRNA, respectively. mRMR (minimum redundancy and maximum relevance), with its elegant mathematical formulation, has been applied widely in processing biological data and feature analysis since its introduction in 2005. mRMR combined with incremental feature selection (IFS) is known to be very efficient in feature selection and analysis, and can improve both the effectiveness and the efficiency of a prediction model. IFS is applied to decide how many features should be selected from the feature list provided by mRMR. Finally, the features selected by mRMR and IFS are further refined by reordering them with a conventional feature selection method, the forward feature wrapper (FFW). Each protein is encoded by 132 features, including amino acid compositions and physicochemical properties. After feature selection, the k-nearest neighbor algorithm, the adopted prediction model, is trained and tested. As a result, the optimized prediction accuracies for DNA, RNA, and rRNA are 82.0%, 83.4%, and 92.3%, respectively. Furthermore, the most important features that contribute to the prediction are identified and analyzed biologically. The predictor developed for this research is publicly available at http://chemdata.shu.edu.cn/protein_na_mrmr/.
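
The mRMR ranking and IFS steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes equal-frequency binning of the features, the difference form of the mRMR criterion (relevance minus mean redundancy), a leave-one-out 1-nearest-neighbor evaluation, and a small synthetic dataset in place of the 132 protein features. All function names are hypothetical, and the FFW refinement step is not covered.

```python
import numpy as np

def mutual_information(a, b):
    """Mutual information (in nats) between two non-negative integer arrays."""
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)          # contingency table of co-occurrences
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                       # skip empty cells to avoid log(0)
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

def discretize(X, bins=3):
    """Equal-frequency binning of each column (an assumption of this sketch)."""
    out = np.zeros(X.shape, dtype=int)
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], np.linspace(0, 1, bins + 1)[1:-1])
        out[:, j] = np.searchsorted(edges, X[:, j])
    return out

def mrmr_rank(X, y, bins=3):
    """Greedily order all features by relevance minus mean redundancy."""
    D = discretize(X, bins)
    n = D.shape[1]
    relevance = np.array([mutual_information(D[:, j], y) for j in range(n)])
    selected = [int(np.argmax(relevance))]
    redundancy_sum = np.zeros(n)
    while len(selected) < n:
        last = selected[-1]
        redundancy_sum += np.array(
            [mutual_information(D[:, j], D[:, last]) for j in range(n)])
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf        # never pick a feature twice
        selected.append(int(np.argmax(score)))
    return selected

def loo_knn_accuracy(X, y, k=1):
    """Leave-one-out accuracy of k-NN with Euclidean distance."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a sample may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]
    pred = np.array([np.bincount(v).argmax() for v in y[nearest]])
    return float((pred == y).mean())

def incremental_feature_selection(X, y, ranking, k=1):
    """Evaluate the top-1, top-2, ... feature subsets and keep the best one."""
    accs = [loo_knn_accuracy(X[:, ranking[:m]], y, k)
            for m in range(1, len(ranking) + 1)]
    best = int(np.argmax(accs)) + 1
    return ranking[:best], accs

# Synthetic stand-in data: 60 samples, 6 features, only feature 2 is informative.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 6))
X[:, 2] += 3 * y

ranking = mrmr_rank(X, y)
chosen, accs = incremental_feature_selection(X, y, ranking)
print("mRMR order:", ranking)
print("IFS keeps:", chosen, "LOO accuracy:", max(accs))
```

In this sketch, IFS simply keeps the shortest prefix of the mRMR ordering that reaches the highest leave-one-out accuracy. A wrapper step such as FFW would then operate on that reduced feature set.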
