Article

Conformal Prediction Classification of a Large Data Set of Environmental Chemicals from ToxCast and Tox21 Estrogen Receptor Assays

Journal

CHEMICAL RESEARCH IN TOXICOLOGY
Volume 29, Issue 6, Pages 1003-1010

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.chemrestox.6b00037

Quantitative structure-activity relationships (QSAR) are critical to exploiting the chemical information in toxicology databases. Exploitation can mean extracting chemical knowledge from the data, but also predicting the activity of new chemicals from a quantitative analysis of past findings. In this study, we analyzed the ToxCast and Tox21 estrogen receptor data sets using conformal prediction to make fuller use of the information they contain. We applied aggregated conformal prediction (ACP) with support vector machine classifiers to both data sets, to compare the overall performance of the models and, more importantly, to explore how ACP performs on data sets that are significantly enriched in one class when no sampling strategy is applied to the training set. ACP was also used to investigate the problem of applicability domain using both data sets. Comparison of ACP with previous results obtained on the same data sets using traditional QSAR approaches indicated overall balanced performance similar to that of methods in which careful training set selections were made, e.g., sensitivity and specificity of 70-75% for the external Tox21 data set. The results were far superior to those obtained using traditional methods without training set sampling, where the corresponding sensitivity and specificity showed a clear imbalance of 50 and 96%, respectively. Application of conformal prediction to imbalanced data sets facilitates an unambiguous analysis of all data, allows accurate predictive models to be built that display similar accuracy in external validation to that in internal validation, and, most importantly, allows an unambiguous treatment of the applicability domain.
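
To make the mechanics behind the abstract concrete, the following is a minimal sketch of class-conditional (Mondrian) conformal p-values aggregated over several runs. It is not the authors' implementation. The function names, the toy nonconformity scores, and the use of the median as the aggregation rule are all assumptions made for illustration; in the study the scores would be derived from support vector machine classifiers, which are omitted here so the example runs with the standard library alone.

```python
from statistics import median


def mondrian_p_value(calibration_scores, test_score):
    """Class-conditional conformal p-value for one candidate label.

    calibration_scores: nonconformity scores of calibration compounds that
    truly belong to the candidate class (larger means stranger).
    test_score: nonconformity score of the new compound under that label.
    """
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def aggregate(p_values):
    """Combine p-values from several runs (median is an assumed choice)."""
    return median(p_values)


def prediction_set(p_by_class, significance):
    """Keep every label whose aggregated p-value exceeds the significance level."""
    return {label for label, p in p_by_class.items() if p > significance}


# Hypothetical toy data: three runs, each with its own calibration scores
# per class and its own scores for one new compound. The inactive class is
# deliberately larger to mimic an imbalanced data set.
runs = [
    ({"active": [0.9, 0.7, 0.4],
      "inactive": [0.8, 0.6, 0.5, 0.3, 0.2, 0.1, 0.1]},
     {"active": 0.5, "inactive": 0.95}),
    ({"active": [0.8, 0.6, 0.3],
      "inactive": [0.9, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1]},
     {"active": 0.7, "inactive": 0.85}),
    ({"active": [0.95, 0.5, 0.2],
      "inactive": [0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]},
     {"active": 0.1, "inactive": 0.9}),
]

# One aggregated p-value per class, calibrated only against that class.
p_by_class = {
    label: aggregate(
        [mondrian_p_value(cal[label], test[label]) for cal, test in runs]
    )
    for label in ("active", "inactive")
}

confident = prediction_set(p_by_class, significance=0.2)   # single label
cautious = prediction_set(p_by_class, significance=0.1)    # both labels
strict = prediction_set(p_by_class, significance=0.8)      # empty set
```

Because each p-value is calibrated only against compounds of its own class, the minority class is not swamped by the majority class, which is one way to read the abstract's claim that no training set sampling is required. In this sketch an empty prediction set marks a compound that conforms to neither class at the chosen significance level, and a set containing both labels marks one the model cannot discriminate; reading these outcomes as an applicability domain signal is an interpretation of the abstract, not a detail it states.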
