☆ 4.5 Article

EnANNDeep: An Ensemble-based lncRNA-protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models

INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES (2022)

Journal

INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES

Volume 14, Issue 1, Pages 209-232

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s12539-021-00483-y

Keywords

lncRNA-protein interaction; Adaptive k-nearest neighbor; Deep neural network; Deep forest; Ensemble learning

Funding

National Natural Science-Foundation of China [62072172, 61803151]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Automated Summary New
Abstract

The EnANNDeep ensemble learning framework is proposed in this study for predicting lncRNA-protein interactions. It integrates multiple datasets and features, and utilizes adaptive k-nearest neighbor classifier and deep models for analysis. The predictions from the three predictors are integrated using a soft voting technique, achieving good predictive performance.

lncRNA-protein interactions (LPIs) prediction can deepen the understanding of many important biological processes. Artificial intelligence methods have reported many possible LPIs. However, most computational techniques were evaluated mainly on one dataset, which may produce prediction bias. More importantly, they were validated only under cross validation on lncRNA-protein pairs, and did not consider the performance under cross validations on lncRNAs and proteins, thus fail to search related proteins/lncRNAs for a new lncRNA/protein. Under an ensemble learning framework (EnANNDeep) composed of adaptive k-nearest neighbor classifier and Deep models, this study focuses on systematically finding underlying linkages between lncRNAs and proteins. First, five LPI-related datasets are arranged. Second, multiple source features are integrated to depict an lncRNA-protein pair. Third, adaptive k-nearest neighbor classifier, deep neural network, and deep forest are designed to score unknown lncRNA-protein pairs, respectively. Finally, interaction probabilities from the three predictors are integrated based on a soft voting technique. In comparing to five classical LPI identification models (SFPEL, PMDKN, CatBoost, PLIPCOM, and LPI-SKF) under fivefold cross validations on lncRNAs, proteins, and LPIs, EnANNDeep computes the best average AUCs of 0.8660, 0.8775, and 0.9166, respectively, and the best average AUPRs of 0.8545, 0.8595, and 0.9054, respectively, indicating its superior LPI prediction ability. Case study analyses indicate that SNHG10 may have dense linkage with Q15717. In the ensemble framework, adaptive k-nearest neighbor classifier can separately pick the most appropriate k for each query lncRNA-protein pair. More importantly, deep models including deep neural network and deep forest can effectively learn the representative features of lncRNAs and proteins.

EnANNDeep: An Ensemble-based lncRNA-protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models

Journal

INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES

Publisher

SPRINGER HEIDELBERG

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

EnANNDeep: An Ensemble-based lncRNA-protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models

Journal

INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES

Publisher

SPRINGER HEIDELBERG

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper