4.6 Article

Updated benchmarking of variant effect predictors using deep mutational scanning

Journal

MOLECULAR SYSTEMS BIOLOGY
Volume 19, Issue 8, Pages -

Publisher

WILEY
DOI: 10.15252/msb.202211474

Keywords

Benchmark; Circularity; DMS; MAVE; VEP


This study evaluates 55 different variant effect predictors (VEPs) using independently generated protein function measurements from deep mutational scanning (DMS) experiments for 26 human proteins, while minimizing data circularity. The top-performing VEPs are mostly unsupervised methods, including EVE, DeepSequence, and the protein language model ESM-1v. However, recent supervised VEPs such as VARITY also perform strongly, suggesting that developers are taking data circularity and bias seriously. Results for variant classification with DMS and unsupervised VEPs are mixed: some DMS datasets perform exceptionally well, while others perform poorly. Notably, VEP agreement with DMS data correlates strongly with the ability to identify clinically relevant variants, supporting the validity of the rankings and the utility of DMS for independent benchmarking.
The assessment of variant effect predictor (VEP) performance is fraught with biases introduced by benchmarking against clinical observations. In this study, building on our previous work, we use independently generated measurements of protein function from deep mutational scanning (DMS) experiments for 26 human proteins to benchmark 55 different VEPs, while introducing minimal data circularity. Many top-performing VEPs are unsupervised methods including EVE, DeepSequence and ESM-1v, a protein language model that ranked first overall. However, the strong performance of recent supervised VEPs, in particular VARITY, shows that developers are taking data circularity and bias issues seriously. We also assess the performance of DMS and unsupervised VEPs for discriminating between known pathogenic and putatively benign missense variants. Our findings are mixed, demonstrating that some DMS datasets perform exceptionally at variant classification, while others are poor. Notably, we observe a striking correlation between VEP agreement with DMS data and performance in identifying clinically relevant variants, strongly supporting the validity of our rankings and the utility of DMS for independent benchmarking.
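
To make the idea of "VEP agreement with DMS data" concrete, the following is a minimal sketch of how predictors might be ranked for a single protein. It assumes that agreement is measured as the absolute Spearman rank correlation between each predictor's scores and the DMS measurements for the same variants; the abstract does not name the metric, so this choice, the toy values, and the names `average_ranks`, `spearman`, `rank_predictors`, `predictor_a`, and `predictor_b` are all illustrative assumptions rather than the study's actual pipeline.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over every value tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks.

    Assumes both inputs contain at least two distinct values.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def rank_predictors(dms_scores, predictions):
    """Sort predictors by |rho| against the DMS scores, best first.

    The absolute value is an assumption: predictors differ in whether
    high scores mean "damaging" or "tolerated".
    """
    agreement = {
        name: abs(spearman(dms_scores, scores))
        for name, scores in predictions.items()
    }
    return sorted(agreement.items(), key=lambda item: item[1], reverse=True)


# Hypothetical DMS fitness scores and predictor outputs for five variants.
dms = [0.9, 0.1, 0.5, 0.7, 0.3]
preds = {
    "predictor_a": [0.2, 0.95, 0.5, 0.3, 0.8],  # exactly reversed ordering
    "predictor_b": [0.6, 0.2, 0.1, 0.9, 0.4],
}
ranking = rank_predictors(dms, preds)
print(ranking)
```

In this toy case `predictor_a` ranks first because it orders the variants exactly opposite to the DMS scores, which counts as perfect agreement once the sign is ignored. A rank-based measure is a natural fit here because DMS scores and predictor outputs are on different scales.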
