Journal
BMC BIOINFORMATICS
Volume 22, Issue 1
Publisher
BMC
DOI: 10.1186/s12859-021-04069-9
Keywords
ncRNA-protein interactions; Multi-scale features combination; Conjoint k-mer; Ensemble deep learning; Independent test; ncRNA-protein networks
Funding
- Beijing Natural Science Foundation [2202002]
- Chinese Natural Science Foundation [21173014]
The study presents EDLMFC, an ensemble deep learning-based method that predicts ncRNA-protein interactions with high accuracy under five-fold cross-validation. Independent tests show that it can also predict potential interactions from different organisms.
Background: Interactions between non-coding RNAs (ncRNAs) and proteins play essential roles in various physiological and pathological processes. Experimental methods for identifying ncRNA-protein interactions are time-consuming and labor-intensive, so there is an increasing demand for computational methods that predict these interactions accurately and efficiently.
Results: In this work, we present EDLMFC, an ensemble deep learning-based method that predicts ncRNA-protein interactions from a combination of multi-scale features: primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer is used to extract protein and ncRNA sequence features, which are integrated with tertiary structure features and fed into an ensemble deep learning model. The model combines a convolutional neural network (CNN), which learns the dominant biological information, with a bi-directional long short-term memory network (BLSTM), which captures long-range dependencies among the features identified by the CNN. Compared with other state-of-the-art methods under five-fold cross-validation, EDLMFC shows the best performance, with accuracies of 93.8%, 89.7%, and 86.1% on the RPI1807, NPInter v2.0, and RPI488 datasets, respectively. The independent test results demonstrate that EDLMFC can effectively predict potential ncRNA-protein interactions from different organisms. Furthermore, EDLMFC successfully predicts hub ncRNAs and proteins in ncRNA-protein networks of Mus musculus.
Conclusions: EDLMFC improves the accuracy of ncRNA-protein interaction prediction and is expected to provide helpful guidance for research on ncRNA functions. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC.
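The conjoint k-mer encoding mentioned in the Results can be illustrated with a minimal sketch. This is not the authors' code (see the repository above for that). The 7-group reduced amino-acid alphabet, the maximum k values, the per-k frequency normalization, and the function names `reduce_protein` and `conjoint_kmer` are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product

# Assumption: a 7-group reduced amino-acid alphabet of the kind commonly used
# for conjoint triad features. The abstract does not state the grouping.
PROTEIN_GROUPS = {
    "A": "AGV", "B": "ILFP", "C": "YMTS", "D": "HNQW",
    "E": "RK", "F": "DE", "G": "C",
}
AA_TO_GROUP = {aa: g for g, members in PROTEIN_GROUPS.items() for aa in members}


def reduce_protein(seq):
    """Map each amino acid to its group letter; unknown residues are skipped."""
    return "".join(AA_TO_GROUP[aa] for aa in seq.upper() if aa in AA_TO_GROUP)


def conjoint_kmer(seq, alphabet, max_k):
    """Concatenate normalized k-mer frequencies for k = 1..max_k.

    Hypothetical helper: frequencies are normalized separately for each k,
    which is one possible choice, not necessarily the paper's.
    """
    features = []
    for k in range(1, max_k + 1):
        kmers = ["".join(p) for p in product(alphabet, repeat=k)]
        counts = dict.fromkeys(kmers, 0)
        # Slide a window of width k over the sequence and count known k-mers.
        for i in range(max(len(seq) - k + 1, 0)):
            kmer = seq[i:i + k]
            if kmer in counts:
                counts[kmer] += 1
        total = sum(counts.values())
        features.extend(counts[km] / total if total else 0.0 for km in kmers)
    return features


# Toy sequences, not taken from any dataset in the paper.
rna_vec = conjoint_kmer("AUGGCUAC", "ACGU", max_k=4)
protein_vec = conjoint_kmer(reduce_protein("MKVLAAGIC"), "ABCDEFG", max_k=3)
```

With these assumed settings, an RNA sequence yields 4 + 16 + 64 + 256 = 340 features and a protein sequence yields 7 + 49 + 343 = 399. In a pipeline like the one described, such vectors would be combined with the structure-derived features before being passed to the CNN and BLSTM layers.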