4.7 Article

Enhanced Prediction of Hot Spots at Protein-Protein Interfaces Using Extreme Gradient Boosting

期刊

SCIENTIFIC REPORTS
卷 8, 期 -, 页码 -

出版社

NATURE PORTFOLIO
DOI: 10.1038/s41598-018-32511-1

关键词

-

资金

  1. National Natural Science Foundation of China [61672541]
  2. Natural Science Foundation of Hunan Province [2017JJ3287]

向作者/读者索取更多资源

Identification of hot spots, a small portion of protein-protein interface residues that contribute the majority of the binding free energy, can provide crucial information for understanding the function of proteins and studying their interactions. Based on our previous method (PredHS), we propose a new computational approach, PredHS2, that can further improve the accuracy of predicting hot spots at protein-protein interfaces. Firstly we build a new training dataset of 313 alanine-mutated interface residues extracted from 34 protein complexes. Then we generate a wide variety of 600 sequence, structure, exposure and energy features, together with Euclidean and Voronoi neighborhood properties. To remove redundant and irrelevant information, we select a set of 26 optimal features utilizing a two-step feature selection method, which consist of a minimum Redundancy Maximum Relevance (mRMR) procedure and a sequential forward selection process. Based on the selected 26 features, we use Extreme Gradient Boosting (XGBoost) to build our prediction model. Performance of our PredHS2 approach outperforms other machine learning algorithms and other state-of-the-art hot spot prediction methods on the training dataset and the independent test set (BID) respectively. Several novel features, such as solvent exposure characteristics, second structure features and disorder scores, are found to be more effective in discriminating hot spots. Moreover, the update of the training dataset and the new feature selection and classification algorithms play a vital role in improving the prediction quality.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据