4.7 Article

Merging Applicability Domains for in Silico Assessment of Chemical Mutagenicity

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/ci500016v

Funding

  1. U.S. Army Medical Research and Materiel Command (Ft. Detrick, MD), as part of the U.S. Army's Network Science Initiative
  2. Defense Threat Reduction Agency [CBCall14-CBS-05-2-0007]

Using a benchmark Ames mutagenicity data set, we evaluated the performance of molecular fingerprints as descriptors for developing quantitative structure-activity relationship (QSAR) models and for defining applicability domains with two machine-learning methods: random forest (RF) and variable nearest neighbor (v-NN). The two methods focus on complementary aspects of chemical mutagenicity and use different characteristics of the molecular fingerprints to achieve high prediction accuracy. RF flags mutagenic compounds using the presence or absence of small molecular fragments akin to structural alerts, whereas the v-NN method uses molecular structural similarity as measured by fingerprint-based Tanimoto distances between molecules. We showed that extended connectivity fingerprints could intuitively be used to define and quantify an applicability domain for either method. The importance of using applicability domains in QSAR modeling cannot be overstated: compounds outside the applicability domain have no close representative in the training set, and therefore cannot be predicted reliably. Using either approach, we developed highly robust models that rival the performance of a state-of-the-art proprietary software package. Importantly, because the two methods take complementary approaches, we showed that combining their predictions expanded the applicability domain from roughly 80% to 90%. These results indicate that the proposed QSAR protocol constitutes a highly robust chemical mutagenicity prediction model.
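The distance-based applicability domain described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: fingerprints are represented as plain sets of on-bit indices, and the distance threshold `d0`, the smoothing factor `h`, the Gaussian-style weighting, and the averaging rule in `combine` are all assumptions chosen for illustration. The RF prediction is treated as an externally supplied value that is `None` when the compound falls outside that model's domain.

```python
import math

def tanimoto_distance(fp_a, fp_b):
    """Tanimoto distance 1 - |A & B| / |A | B| for fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(fp_a & fp_b) / union

def vnn_predict(query, training, d0=0.3, h=0.2):
    """Distance-weighted vote over all training neighbors within d0.

    training: list of (fingerprint_set, label) pairs, label 1 = mutagenic, 0 = not.
    Returns a score in [0, 1], or None when no neighbor lies within d0,
    i.e. the query is outside the applicability domain.
    d0, h, and the weighting function are illustrative assumptions.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for fp, label in training:
        d = tanimoto_distance(query, fp)
        if d <= d0:
            w = math.exp(-((d / h) ** 2))
            weighted_sum += w * label
            weight_total += w
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total

def combine(vnn_score, rf_score):
    """Merge two model outputs (hypothetical rule): average the scores that exist."""
    available = [s for s in (vnn_score, rf_score) if s is not None]
    if not available:
        return None  # outside both applicability domains
    return sum(available) / len(available)

def coverage(scores):
    """Fraction of compounds that received a prediction (inside the domain)."""
    return sum(s is not None for s in scores) / len(scores)

# Toy data: three training compounds and two queries.
training = [({1, 2, 3, 4}, 1), ({1, 2, 3, 5}, 1), ({10, 11, 12}, 0)]
inside = vnn_predict({1, 2, 3, 4}, training)   # has a neighbor within d0
outside = vnn_predict({20, 21}, training)      # no neighbor within d0
```

A compound that v-NN cannot score may still fall inside the RF domain, so `combine(outside, rf_score)` returns a prediction whenever either model covers the compound. Under this assumed rule, the merged coverage can only equal or exceed the coverage of each model alone.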
