Article

BlackOPs: increasing confidence in variant detection through mappability filtering

Journal

Nucleic Acids Research
Volume 41, Issue 19

Publisher

Oxford University Press
DOI: 10.1093/nar/gkt692

Funding

  1. National Institutes of Health [U24 CA143848, U24 CA143848-02S1, F32 CA142039]
  2. National Science Foundation, Directorate for Biological Sciences [0850237]
  3. Directorate for Computer & Information Science & Engineering
  4. National Science Foundation, Division of Information & Intelligent Systems [1054631]
  5. National Science Foundation, Emerging Frontiers [0850237]

Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
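The filtering idea described in the abstract can be illustrated with a minimal sketch. It is not BlackOPs itself: the `Variant` record, the function names, and the exhaustive tiling used to simulate reads are all assumptions made for illustration, and the alignment and variant-calling steps that would produce the blacklist are left out.

```python
from typing import Iterable, Iterator, List, NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical variant record: a position plus its alleles."""
    chrom: str
    pos: int   # 1-based position
    ref: str
    alt: str


def simulate_reads(reference: str, read_length: int, step: int = 1) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, read) pairs cut directly from the reference.

    Assumption: reads are simple error-free windows over the reference, so
    any variant later called from them can only come from mismapping.
    """
    for start in range(0, len(reference) - read_length + 1, step):
        yield start + 1, reference[start:start + read_length]


def filter_against_blacklist(
    calls: Iterable[Variant], blacklist: Set[Variant]
) -> Tuple[List[Variant], List[Variant]]:
    """Split variant calls into (kept, flagged) by blacklist membership.

    A call is flagged only when chromosome, position, and both alleles match.
    """
    kept: List[Variant] = []
    flagged: List[Variant] = []
    for call in calls:
        (flagged if call in blacklist else kept).append(call)
    return kept, flagged


if __name__ == "__main__":
    # Made-up calls and blacklist entries, for illustration only.
    sample_calls = [Variant("chr1", 100, "A", "G"), Variant("chr1", 200, "C", "T")]
    blacklist = {Variant("chr1", 100, "A", "G")}
    kept, flagged = filter_against_blacklist(sample_calls, blacklist)
    print("kept:", kept)
    print("flagged as likely mapping artifacts:", flagged)
```

Matching on alleles as well as position follows the abstract's description of a blacklist of "positions and alleles": a different allele at a blacklisted position is not filtered in this sketch. Because the abstract reports that blacklists depend on the aligner and read length, a blacklist would need to be regenerated for each experimental setup.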
