Article

Robust and accurate prediction of protein-protein interactions by exploiting evolutionary information

Journal

SCIENTIFIC REPORTS
Volume 11, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-021-96265-z

Funding

  1. National Science Foundation of China [61873212, 61722212, 61902342, 62072378]

This study introduces a computational method for predicting PPIs based on protein sequence information, utilizing a combination of OLPP and RoF models to identify non-interacting and interacting protein pairs with high accuracy on Yeast and Human datasets. The proposed method serves as a valuable tool in accelerating the resolution of key problems in proteomics.
Protein-protein interactions (PPIs) carry out many of the biochemical functions of organisms. Recognizing PPIs is therefore essential for understanding most life activities, such as DNA replication and transcription, protein synthesis and secretion, signal transduction, and metabolism. Although high-throughput technology makes it possible to generate large-scale PPI data, it is costly in both time and labor and carries the risk of a high false-positive rate. The biology community is therefore looking for computational methods that can discover protein interaction data quickly and efficiently at large scale. In this paper, we propose a computational method for predicting PPIs from protein sequence information by combining orthogonal locality preserving projections (OLPP) with a rotation forest (RoF) model. Specifically, each protein sequence is first converted into a position-specific scoring matrix (PSSM), which contains the protein's evolutionary information, using the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). We then characterize each protein as a fixed-length feature vector by applying OLPP to its PSSM. Finally, we train an RoF classifier to distinguish interacting from non-interacting protein pairs. The proposed method yielded significantly better results than existing methods, with prediction accuracies of 90.07% and 96.09% on the Yeast and Human datasets, respectively. Our experiments show that the proposed method can serve as a useful tool to accelerate the solving of key problems in proteomics.
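
The classification step can be illustrated with a minimal sketch of the feature rotation that gives rotation forest its name. This is not the authors' implementation. It assumes that each protein pair is already represented as a row of a feature matrix `X` (for example, by joining the two proteins' OLPP-derived vectors, which is an assumption here), and the function name `rotation_matrix`, the 75% bootstrap fraction, and the random demo data are all hypothetical choices. The sketch also omits the random class-subset selection used in the original rotation forest formulation. For one tree, the features are split into disjoint subsets, PCA is run on each subset, and the components are assembled into a single rotation matrix.

```python
import numpy as np


def rotation_matrix(X, n_subsets, rng):
    """Build one rotation-forest-style rotation matrix for feature matrix X.

    Hypothetical helper for illustration only: X has shape
    (n_samples, n_features), and rng is a numpy random Generator.
    """
    n_samples, n_features = X.shape
    # Randomly partition the feature indices into disjoint subsets.
    subsets = np.array_split(rng.permutation(n_features), n_subsets)
    R = np.zeros((n_features, n_features))
    for idx in subsets:
        # Bootstrap 75% of the samples so each tree gets a different PCA
        # (the 75% fraction is an assumed setting).
        n_boot = max(2, int(0.75 * n_samples))
        rows = rng.choice(n_samples, size=n_boot, replace=True)
        block = X[rows][:, idx]
        block = block - block.mean(axis=0)
        # PCA via eigen-decomposition of the covariance matrix;
        # all components are kept, so no information is discarded.
        cov = np.atleast_2d(np.cov(block, rowvar=False))
        _, components = np.linalg.eigh(cov)
        # Place this subset's components in the matching rows and columns.
        R[np.ix_(idx, idx)] = components
    return R


# Demo on synthetic data standing in for protein-pair feature vectors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 8))
R_demo = rotation_matrix(X_demo, n_subsets=2, rng=rng)
X_rotated = X_demo @ R_demo  # each tree would be trained on its own rotation
print(X_rotated.shape)
```

In a full ensemble, each decision tree would be trained on data rotated by its own matrix, and the trees' predictions would be combined to label a pair as interacting or non-interacting. Because every subset keeps all of its principal components, the assembled matrix is orthogonal, so the rotation changes the axes the trees split on without discarding information.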

Recommended

Article Biotechnology & Applied Microbiology

Hierarchical graph attention network for miRNA-disease association prediction

Zhengwei Li, Tangbo Zhong, Deshuang Huang, Zhu-Hong You, Ru Nie

Summary: This study proposes a novel deep learning model called HGANMDA to predict miRNA-disease associations. The model constructs a heterogeneous graph and uses a hierarchical graph attention network to predict miRNA-disease associations accurately.

MOLECULAR THERAPY (2022)

Article Biochemical Research Methods

GraphTGI: an attention-based graph embedding model for predicting TF-target gene interactions

Zhi-Hua Du, Yang-Han Wu, Yu-An Huang, Jie Chen, Gui-Qing Pan, Lun Hu, Zhu-Hong You, Jian-Qiang Li

Summary: This study introduces a graph attention-based autoencoder model to predict TF-target gene interactions, which shows excellent prediction performance on a real dataset.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biochemical Research Methods

iGRLCDA: identifying circRNA-disease association based on graph representation learning

Han-Yuan Zhang, Lei Wang, Zhu-Hong You, Lun Hu, Bo-Wei Zhao, Zheng-Wei Li, Yang-Ming Li

Summary: Researchers have discovered a novel topology of RNA transcript called circular RNA (circRNA) that competes with messenger RNA (mRNA) and long noncoding RNA in gene regulation. This finding suggests that circRNA could be associated with complex diseases, so identifying the relationship between them would contribute to medical research. However, in vitro experiments to determine circRNA-disease associations are time-consuming and lack direction. To address this, a computational method called iGRLCDA was proposed, which uses a graph convolution network (GCN) and graph factorization (GF) to predict circRNA-disease associations. In five-fold cross-validation, iGRLCDA was compared with other prediction models and showed strong competitiveness and high accuracy.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biochemical Research Methods

Attention-based Knowledge Graph Representation Learning for Predicting Drug-drug Interactions

Xiaorui Su, Lun Hu, Zhuhong You, Pengwei Hu, Bowei Zhao

Summary: This paper proposes a knowledge graph (KG)-based drug-drug interaction (DDI) prediction framework called DDKG, which uses KG information and an attention mechanism. Experimental results show that DDKG outperforms existing algorithms on different evaluation metrics.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biotechnology & Applied Microbiology

BioDKG-DDI: predicting drug-drug interactions based on drug knowledge graph fusing biochemical information

Zhong-Hao Ren, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Yong-Jian Guan, Xin-Fei Wang, Jie Pan

Summary: Co-administration of drugs is an effective strategy for treating complex diseases, but predicting drug-drug interactions (DDIs) accurately is challenging. In this paper, a novel method called BioDKG-DDI is proposed to predict potential DDIs by integrating multiple features and biochemical information using an attention mechanism and deep neural network. Experimental results demonstrate that this method is robust and simple, and can serve as a beneficial supplement to the experimental process.

BRIEFINGS IN FUNCTIONAL GENOMICS (2022)

Article Genetics & Heredity

SAWRPI: A Stacking Ensemble Framework With Adaptive Weight for Predicting ncRNA-Protein Interactions Using Sequence Information

Zhong-Hao Ren, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Yong-Jian Guan, Yue-Chao Li, Jie Pan

Summary: Non-coding RNAs (ncRNAs) play important roles in biological processes through interactions with RNA binding proteins (RBPs). Computational methods have been developed to predict ncRNA-protein interactions, but some of them have limited applicability. In this study, a computational method called SAWRPI is proposed to predict ncRNA-protein interactions using sequence information. The method achieved high performance in experiments, showing its potential as a reliable tool for predicting ncRNA-protein interactions.

FRONTIERS IN GENETICS (2022)

Article Biology

BioChemDDI: Predicting Drug-Drug Interactions by Fusing Biochemical and Structural Information through a Self-Attention Mechanism

Zhong-Hao Ren, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Jie Pan, Yong-Jian Guan, Lu-Xiang Guo

Summary: Combining drugs to fight against diseases has a long history, but potential drug interactions can lead to unknown toxicity. Our study introduces a computational framework and an online tool for researchers to identify potential interactions in the fields of biomedicine and pharmacology. This approach provides new insights for rapidly identifying drug-drug interactions.

BIOLOGY-BASEL (2022)

Article Biochemistry & Molecular Biology

SIPGCN: A Novel Deep Learning Model for Predicting Self-Interacting Proteins from Sequence Information Using Graph Convolutional Networks

Ying Wang, Lin-Lin Wang, Leon Wong, Yang Li, Lei Wang, Zhu-Hong You

Summary: Proteins are fundamental organic substances in cells and play a crucial role in biological activities. Self-interaction is an important type of protein interaction. This study presents a self-interacting protein (SIP) prediction method, SIPGCN, which uses a deep learning graph convolutional network (GCN). The results demonstrate excellent performance of SIPGCN.

BIOMEDICINES (2022)

Article Biology

Predicting Protein-Protein Interactions Based on Ensemble Learning-Based Model from Protein Sequence

Xinke Zhan, Mang Xiao, Zhuhong You, Chenggang Yan, Jianxin Guo, Liping Wang, Yaoqi Sun, Bingwan Shang

Summary: This paper proposes a computational method for predicting protein-protein interactions from protein sequences. The method utilizes PSSM, LPP, and RF for feature extraction and classification. Experimental results demonstrate that the method is stable, accurate, and promising as a useful tool for proteomics research.

BIOLOGY-BASEL (2022)

Article Biochemical Research Methods

Line graph attention networks for predicting disease-associated Piwi-interacting RNAs

Kai Zheng, Xin-Lu Zhang, Lei Wang, Zhu-Hong You, Zhao-Hui Zhan, Hao-Yuan Li

Summary: PIWI proteins and piRNAs are commonly found in human cancers and are associated with poorer clinical outcomes. A new graph neural network framework called line graph attention networks (LGAT) is developed for predicting associations between piRNAs and diseases. Experimental results show that LGAT performs excellently in identifying potential associations.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biochemical Research Methods

Identifying Protein Complexes From Protein-Protein Interaction Networks Based on Fuzzy Clustering and GO Semantic Information

Xiangyu Pan, Lun Hu, Pengwei Hu, Zhu-Hong You

Summary: Protein complexes play a crucial role in understanding protein biological processes. In this study, we propose a novel fuzzy-based clustering framework called FCAN-PCI, which considers both network topology and protein attribute information to improve identification accuracy and identify overlapping complexes.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2022)

Review Computer Science, Information Systems

In silico prediction methods of self-interacting proteins: an empirical and academic survey

Zhanheng Chen, Zhuhong You, Qinhu Zhang, Zhenhao Guo, Siguo Wang, Yanbin Wang

Summary: This review provides a comprehensive overview of recent literature on computational prediction of self-interacting proteins (SIPs), serving as a valuable reference for future work. The review first describes the data required for predicting drug-target interactions (DTIs), followed by the presentation of interesting feature extraction methods and computational models. An empirical comparison is then conducted to demonstrate the prediction performance of various classifiers under different feature extraction and encoding schemes. Overall, potential methods for further enhancing SIPs prediction performance and related research directions are summarized and highlighted.

FRONTIERS OF COMPUTER SCIENCE (2023)

Article Automation & Control Systems

MGRCDA: Metagraph Recommendation Method for Predicting CircRNA-Disease Association

Lei Wang, Zhu-Hong You, De-Shuang Huang, Jian-Qiang Li

Summary: This study presents a new computational model, MGRCDA, which utilizes metagraph recommendation theory to predict potential circRNA-disease associations. By integrating heterogeneous biological networks and utilizing an iterative search algorithm, MGRCDA achieved high prediction accuracy and reliability. The experimental results demonstrate its feasibility and efficiency in reducing the scope of wet-lab experiments.

IEEE TRANSACTIONS ON CYBERNETICS (2023)

Article Biology

MFIDMA: A Multiple Information Integration Model for the Prediction of Drug-miRNA Associations

Yong-Jian Guan, Chang-Qing Yu, Yan Qiao, Li-Ping Li, Zhu-Hong You, Zhong-Hao Ren, Yue-Chao Li, Jie Pan

Summary: This study presents a computational method called MFIDMA for predicting drug-miRNA associations. The proposed model demonstrates excellent performance in experiments and can be used for the development and research of miRNA-targeted drugs, providing new perspectives on miRNA therapeutics research and drug discovery.

BIOLOGY-BASEL (2023)

Article Biochemical Research Methods

PTBGRP: predicting phage-bacteria interactions with graph representation learning on microbial heterogeneous information network

Jie Pan, Zhuhong You, Wencai You, Tian Zhao, Chenlu Feng, Xuexia Zhang, Fengzhi Ren, Sanxing Ma, Fan Wu, Shiwei Wang, Yanmei Sun

Summary: This study developed a model called PTBGRP, based on a microbial heterogeneous interaction network, to predict new phages for bacterial hosts. After integrating different biological attributes and topological features, a deep neural network classifier was used to predict unknown phage-bacteria interaction (PBI) pairs. Experimental results demonstrated that PTBGRP achieved the best performance on pathogen and PBI datasets.

BRIEFINGS IN BIOINFORMATICS (2023)
