Journal
FRONTIERS IN GENETICS
Volume 11, Issue -, Pages -
Publisher
FRONTIERS MEDIA SA
DOI: 10.3389/fgene.2020.626500
Keywords
protein subcellular localization; network embedding; functional embedding; gene ontology; KEGG pathway
Funding
- Strategic Priority Research Program of Chinese Academy of Sciences [XDB38050200]
- National Key R&D Program of China [2018YFC0910403, 2017YFC1201200]
- Shanghai Municipal Science and Technology Major Project [2017SHZDZX01]
- National Natural Science Foundation of China [31701151]
- Shanghai Sailing Program [16YF1413800]
- Youth Innovation Promotion Association of Chinese Academy of Sciences (CAS) [2016245]
- Fund of the Key Laboratory of Tissue Microenvironment and Tumor of Chinese Academy of Sciences [202002]
This study introduces an embedding-based method for predicting the subcellular localization of proteins by learning functional and network embeddings. The two embeddings are combined into a novel protein representation, and the resulting classification model outperforms conventional methods on a benchmark dataset of 4,861 proteins from 16 locations.
The functions of proteins are largely determined by their subcellular localizations in cells. Many computational methods for predicting the subcellular localization of proteins have been proposed. However, these methods still need improvement, especially in how proteins are represented. In this study, we present an embedding-based method for predicting the subcellular localization of proteins. We first learn functional embeddings of KEGG/GO terms, which are then used to represent proteins. Next, we characterize the network embeddings of proteins on a protein-protein network. The functional and network embeddings are combined into novel protein representations for constructing the final classification model. On our collected benchmark dataset of 4,861 proteins from 16 locations, the best model achieves a Matthews correlation coefficient of 0.872, outperforming multiple conventional methods.
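The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how term embeddings are pooled per protein or how the two embeddings are combined, so mean pooling and concatenation below are assumptions, and the function names and term identifiers are hypothetical. The multiclass Matthews correlation coefficient follows the standard confusion-count generalization, which may differ in detail from the evaluation code used in the study.

```python
from collections import Counter
from math import sqrt


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def protein_representation(term_ids, term_embeddings, network_embedding):
    """Build one feature vector for a protein.

    Assumption: the functional part is the mean of the embeddings of the
    protein's KEGG/GO terms, and it is combined with the network embedding
    by simple concatenation. The source only says they are "combined".
    """
    functional = mean_vector([term_embeddings[t] for t in term_ids])
    return functional + list(network_embedding)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label lists."""
    s = len(y_true)                                   # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_counts = Counter(y_true)                        # true occurrences per class
    p_counts = Counter(y_pred)                        # predicted occurrences per class
    classes = set(t_counts) | set(p_counts)
    cov_tp = c * s - sum(t_counts[k] * p_counts[k] for k in classes)
    cov_tt = s * s - sum(v * v for v in t_counts.values())
    cov_pp = s * s - sum(v * v for v in p_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    # By convention, return 0.0 when the coefficient is undefined.
    return cov_tp / denom if denom else 0.0


# Toy data with hypothetical term identifiers and 2-dimensional embeddings.
term_embeddings = {"term_a": [1.0, 3.0], "term_b": [3.0, 5.0]}
features = protein_representation(["term_a", "term_b"], term_embeddings, [0.5])
score = multiclass_mcc([0, 0, 1, 1], [0, 1, 1, 1])
```

In this sketch, `features` would be passed to any multiclass classifier over the 16 locations. The choice of classifier is left open here because the abstract does not name one.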