☆ 4.7 Article

NeuroPred-FRL: an interpretable prediction model for identifying neuropeptide using feature representation learning

BRIEFINGS IN BIOINFORMATICS (2021)

Journal

BRIEFINGS IN BIOINFORMATICS

Volume 22, Issue 6, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bib/bbab167

Keywords

neuropeptide; feature representation learning; two-step feature selection; machine learning; cross-validation

Categories

Biochemical Research Methods; Mathematical & Computational Biology

Funding

Japan Society for the Promotion of Science (JSPS) [19H04208, 19F19377]
National Research Foundation of Korea (NRF) - Korean government (MSIT) [2021R1A2C1014338]
Grants-in-Aid for Scientific Research [19F19377] Funding Source: KAKEN
National Research Foundation of Korea [2021R1A2C1014338] Funding Source: Korea Institute of Science & Technology Information (KISTI), National Science & Technology Information Service (NTIS)

Smart Summary

Neuropeptides play a crucial role in regulating immune systems, and rapid and accurate identification of them is essential for basic research and drug development. Using machine learning and feature representation learning, the developed NeuroPred-FRL predictor demonstrates superior prediction performance, serving as a powerful tool for large-scale identification of neuropeptides.

Abstract

Neuropeptides (NPs) are the most versatile neurotransmitters in the immune systems that regulate various central anxious hormones. An efficient and effective bioinformatics tool for rapid, accurate, large-scale identification of NPs is critical in immunoinformatics and indispensable for basic research and drug development. Although a few NP prediction tools have been developed, their prediction performance still needs to be improved. In this study, we developed a machine learning-based meta-predictor called NeuroPred-FRL by employing the feature representation learning approach. First, we generated 66 optimal baseline models by employing 11 different encodings, six different classifiers and a two-step feature selection approach. The predicted probability scores of NPs from the 66 baseline models were combined to form the input feature vector. Second, to enhance the feature representation ability, we applied the two-step feature selection approach to optimize the 66-D probability feature vector and then fed the optimal one into a random forest classifier to construct the final meta-model (NeuroPred-FRL). Benchmarking experiments based on both cross-validation and independent tests indicate that NeuroPred-FRL achieves superior NP prediction performance compared with other state-of-the-art predictors. We believe that the proposed NeuroPred-FRL can serve as a powerful tool for large-scale identification of NPs, facilitating the characterization of their functional mechanisms and expediting their applications in clinical therapy. Moreover, we interpreted some model mechanisms of NeuroPred-FRL by leveraging the robust SHapley Additive exPlanations (SHAP) algorithm.
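
The feature representation step described in the abstract (one baseline model per encoding/classifier pair, with each model's predicted probability becoming one dimension of a new feature vector) can be illustrated with a small standard-library sketch. Everything in it is an assumption made for illustration: the two toy encodings, the two nearest-centroid scorers, the residue groups, and the peptide sequences are hypothetical stand-ins, not the 11 encodings and six classifiers used by NeuroPred-FRL. The sketch stops at the probability vector; the two-step feature selection and the random forest meta-model are not reproduced.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical coarse residue groups, chosen only for this illustration.
GROUPS = {"hydrophobic": "AVLIMFWC", "polar": "STNQYGP", "charged": "DEKRH"}

def aac(seq):
    """Toy encoding 1: amino acid composition (20 frequencies)."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def group_composition(seq):
    """Toy encoding 2: fraction of residues in each coarse group."""
    return [sum(seq.count(a) for a in members) / len(seq)
            for members in GROUPS.values()]

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroid(X, y, dist):
    """Toy classifier: compare distances to the class centroids and
    squash the difference into a (0, 1) score with a logistic function."""
    pos = centroid([x for x, label in zip(X, y) if label == 1])
    neg = centroid([x for x, label in zip(X, y) if label == 0])
    def predict_proba(x):
        return 1.0 / (1.0 + math.exp(-10.0 * (dist(x, neg) - dist(x, pos))))
    return predict_proba

ENCODERS = {"AAC": aac, "GROUP": group_composition}
CLASSIFIERS = {"centroid-L2": euclidean, "centroid-L1": manhattan}

def train_baselines(seqs, labels):
    """Train one baseline model per (encoding, classifier) pair."""
    models = []
    for enc_name, encode in ENCODERS.items():
        X = [encode(s) for s in seqs]
        for clf_name, dist in CLASSIFIERS.items():
            models.append((enc_name, clf_name, encode,
                           fit_centroid(X, labels, dist)))
    return models

def probability_features(seq, models):
    """One predicted probability per baseline model: the new feature vector."""
    return [proba(encode(seq)) for _, _, encode, proba in models]

# Made-up sequences: label 1 = positive class, 0 = negative class.
train_seqs = ["KRKRGGKR", "RKKRSGRK", "AVLIAVLL", "LLIVAVIL"]
train_labels = [1, 1, 0, 0]

models = train_baselines(train_seqs, train_labels)
vec = probability_features("KRRKGSKR", models)
print(len(models), [round(p, 3) for p in vec])
```

With 2 encodings and 2 scorers the vector has 4 dimensions; with 11 encodings and six classifiers the same construction yields the 66-D vector mentioned in the abstract. Stacking setups commonly compute these probabilities on held-out folds rather than on the training data itself, to limit leakage into the meta-model; the sketch skips that for brevity.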

NeuroPpred-Fuse: an interpretable stacking model for prediction of neuropeptides by fusing sequence information and feature selection methods

Mingming Jiang, Bowen Zhao, Shenggan Luo, Qiankun Wang, Yanyi Chu, Tianhang Chen, Xueying Mao, Yatong Liu, Yanjing Wang, Xue Jiang, Dong-Qing Wei, Yi Xiong

Summary: This study developed an interpretable stacking model, NeuroPpred-Fuse, for predicting neuropeptides by fusing sequence-derived features and feature selection methods. The model achieved 90.6% accuracy and 95.8% AUC on the independent test set, outperforming current state-of-the-art models and demonstrating strong generalization ability.