Journal
FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY
Volume 8, Issue -, Pages -
Publisher
FRONTIERS MEDIA SA
DOI: 10.3389/fbioe.2020.00730
Keywords
decision tree; human interactome; prediction; protein-protein interaction; quantitative feature
Funding
- Shanghai Municipal Science and Technology Major Project [2017SHZDZX01]
- National Key R&D Program of China [2018YFC0910403]
- National Natural Science Foundation of China [31701151]
- Natural Science Foundation of Shanghai [17ZR1412500]
- Shanghai Sailing Program [16YF1413800]
- Youth Innovation Promotion Association of Chinese Academy of Sciences (CAS) [2016245]
Proteins are among the most important components of all living organisms. All essential biological structures and functions rely on proteins and their respective biological functions. However, proteins cannot carry out these functions independently; they must interact with each other to realize the complicated biological processes in all living creatures, including human beings. In other words, proteins depend on protein-protein interactions (PPIs) to exert their effects. Thus, the relative importance and quantitative contribution of candidate PPI features need to be determined. According to previous studies, 258 physical and chemical characteristics of proteins have been reported and confirmed to affect the interaction efficiency of the related proteins. Among these, essential physicochemical features such as stoichiometric balance, protein abundance, molecular weight, and charge distribution have been validated as significant for PPIs. Therefore, in this study, we first presented a novel computational framework that identifies the key factors affecting PPIs using Boruta feature selection (BFS), Monte Carlo feature selection (MCFS), and incremental feature selection (IFS). We then built a quantitative decision-rule system that evaluates potential PPIs under real conditions using random forest (RF) and RIPPER algorithms, thereby supplying several new insights into the detailed biological mechanisms of complicated PPIs. The main datasets and code can be downloaded at https://github.com/xypan1232/Mass-PPI.
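To make the incremental feature selection (IFS) step more concrete, the following is a minimal sketch of the general idea, not the authors' implementation: features are ranked, a classifier is evaluated on the top-1, top-2, ... top-N features, and the smallest best-scoring subset is kept. Everything here is an illustrative assumption: the toy data, the function names, the simple class-separation score (standing in for an MCFS or Boruta ranking), the nearest-centroid classifier (standing in for random forest), and leave-one-out accuracy as the evaluation metric. The code uses only the Python standard library.

```python
import random
import statistics


def make_toy_data(n_per_class=20, n_features=5, seed=0):
    """Hypothetical toy data: features 0 and 1 are informative, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 3.0 * label
            row[1] += 3.0 * label
            X.append(row)
            y.append(label)
    return X, y


def rank_features(X, y):
    """Rank feature indices by a class-separation score (stand-in for MCFS/Boruta)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = statistics.pstdev(pos + neg) or 1.0  # avoid division by zero
        scores.append(abs(statistics.mean(pos) - statistics.mean(neg)) / spread)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def centroid_predict(train_X, train_y, x):
    """Nearest-centroid classifier (stand-in for random forest)."""
    best_label, best_dist = None, float("inf")
    for label in set(train_y):
        members = [r for r, l in zip(train_X, train_y) if l == label]
        centroid = [statistics.mean(col) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


def incremental_feature_selection(X, y, ranked):
    """Score the top-k ranked features for every k and keep the best subset."""
    curve = []
    for k in range(1, len(ranked) + 1):
        subset = ranked[:k]
        X_k = [[row[j] for j in subset] for row in X]
        curve.append((k, loo_accuracy(X_k, y)))
    # Prefer the highest accuracy; break ties in favor of fewer features.
    best_k, best_score = max(curve, key=lambda t: (t[1], -t[0]))
    return ranked[:best_k], best_score, curve


X, y = make_toy_data()
ranked = rank_features(X, y)
selected, best_score, curve = incremental_feature_selection(X, y, ranked)
print("ranked features:", ranked)
print("selected subset:", selected, "accuracy:", round(best_score, 3))
```

In a setting closer to the one described above, the ranking would come from a dedicated feature-selection method and the classifier would be a random forest evaluated with cross-validation; the loop over growing prefixes of the ranked list is the part that corresponds to IFS.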