Journal
BMC BIOINFORMATICS
Volume 20, Issue 1, Pages -
Publisher
BMC
DOI: 10.1186/s12859-019-3135-4
Keywords
Ensemble-learning; Meta-learning; Drug-prediction
Funding
- Seoul National University
- National Research Foundation of Korea (NRF) - Korea government (Ministry of Science and ICT) [2014M3C9A3063541, 2018R1A2B3001628]
- Brain Korea 21 Plus Project in 2018
- Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI) - Ministry of Health and Welfare, Republic of Korea [HI15C3224]
Abstract
Background: Quantitative structure-activity relationship (QSAR) modeling is a computational method for revealing relationships between the structural properties of chemical compounds and their biological activities. QSAR modeling is essential for drug discovery, but it has many constraints. Ensemble-based machine learning approaches have been used to overcome these constraints and obtain reliable predictions. Ensemble learning builds a set of diversified models and combines them. However, the most prevalent approach, random forest, and other ensemble approaches in QSAR prediction limit their model diversity to a single subject.
Results: The proposed ensemble method consistently outperformed thirteen individual models on 19 bioassay datasets and demonstrated superiority over other ensemble approaches that are limited to a single subject. The comprehensive ensemble method is publicly available at .
Conclusions: We propose a comprehensive ensemble method that builds multi-subject diversified models and combines them through second-level meta-learning. In addition, we propose an end-to-end neural network-based individual classifier that can automatically extract sequential features from a simplified molecular-input line-entry system (SMILES) string. The proposed individual model did not show impressive results as a single model, but it was considered the most important predictor when combined, according to the interpretation of the meta-learning.
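To make the idea of second-level meta-learning concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a logistic-regression meta-learner, which the abstract does not specify, and it uses a small hand-made table of first-level probabilities from two hypothetical base models. The names `fit_meta_learner`, `predict`, and `first_level` are illustrative only. The learned weights give one simple way to read off which first-level predictor the meta-learner relies on most.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(first_level, labels, lr=1.0, epochs=2000):
    """Fit a logistic-regression meta-learner by batch gradient descent.

    first_level: one row per compound, one column per base model; each
                 entry is that model's predicted probability of activity.
    labels:      1 for active, 0 for inactive.
    """
    n, m = len(first_level), len(first_level[0])
    # Center probabilities at 0.5 so a zero weight means "ignore this model".
    rows = [[p - 0.5 for p in row] for row in first_level]
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * m, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(m):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, row):
    """Combined probability of activity for one row of base-model outputs."""
    return sigmoid(sum(wj * (p - 0.5) for wj, p in zip(w, row)) + b)


# Toy, hand-made data (hypothetical, not from the paper): column 0 is an
# informative base model, column 1 hovers around 0.5 regardless of the label.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
first_level = [
    [0.9, 0.6], [0.1, 0.5], [0.8, 0.4], [0.2, 0.5],
    [0.7, 0.5], [0.3, 0.6], [0.9, 0.5], [0.2, 0.4],
]

weights, bias = fit_meta_learner(first_level, labels)
predicted = [1 if predict(weights, bias, r) >= 0.5 else 0 for r in first_level]
print("meta-learner weights:", weights)
print("combined predictions:", predicted)
```

In a real stacking setup the first-level probabilities fed to the meta-learner should come from held-out folds, so that the second level is not trained on predictions the base models made for their own training data.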