Article

Sensitive alignment using paralogous sequence variants improves long-read mapping and variant calling in segmental duplications

NUCLEIC ACIDS RESEARCH (2020)

Journal

NUCLEIC ACIDS RESEARCH

Volume 48, Issue 19, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/nar/gkaa829

Category

Biochemistry & Molecular Biology

Funding

National Human Genome Research Institute [R01HG010149]
National Institutes of Health [R01HG10759]

Abstract

The ability to characterize repetitive regions of the human genome is limited by the read lengths of short-read sequencing technologies. Although long-read sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies can potentially overcome this limitation, long segmental duplications with high sequence identity pose challenges for long-read mapping. We describe a probabilistic method, DuploMap, designed to improve the accuracy of long-read mapping in segmental duplications. It analyzes reads mapped to segmental duplications using existing long-read aligners and leverages paralogous sequence variants (PSVs), which are sequence differences between paralogous sequences, to distinguish between multiple alignment locations. On simulated datasets, DuploMap increased the percentage of reads mapped correctly with high confidence for multiple long-read aligners, including Minimap2 (from 74.3% to 90.6%) and BLASR (from 82.9% to 90.7%), while maintaining high precision. Across multiple whole-genome long-read datasets, DuploMap aligned an additional 8-21% of the reads in segmental duplications with high confidence relative to Minimap2. Using DuploMap-aligned PacBio circular consensus sequencing reads, an additional 8.9 Mb of DNA sequence was mappable, variant calling achieved a higher F1 score, and 14,713 additional variants supported by linked-read data were identified. Finally, we demonstrate that a significant fraction of PSVs in segmental duplications overlaps with variants and adversely impacts short-read variant calling.
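The abstract does not spell out DuploMap's probabilistic model, so the following is only a minimal illustrative sketch of the general idea of using PSVs to choose between candidate alignment locations. It is not the authors' implementation. The function names (`copy_posteriors`, `mapping_quality`), the uniform prior over copies, the single per-base `error_rate`, and the quality cap of 60 are all assumptions made for illustration.

```python
import math


def copy_posteriors(read_bases, psv_alleles, error_rate=0.1):
    """Posterior probability of each candidate copy, given the read's bases at PSV sites.

    read_bases:  bases observed in the read at each PSV site (None if the site is not covered).
    psv_alleles: dict mapping a copy name to the base that copy carries at each PSV site.
    error_rate:  assumed per-base sequencing error rate (hypothetical default).
    """
    log_lik = {}
    for copy_name, alleles in psv_alleles.items():
        total = 0.0
        for observed, expected in zip(read_bases, alleles):
            if observed is None:
                continue  # an uncovered site carries no information
            if observed == expected:
                total += math.log(1.0 - error_rate)
            else:
                # an error is assumed equally likely to yield any of the 3 other bases
                total += math.log(error_rate / 3.0)
        log_lik[copy_name] = total

    # Normalize under a uniform prior, using log-sum-exp for numerical stability.
    peak = max(log_lik.values())
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in log_lik.values()))
    return {name: math.exp(v - log_norm) for name, v in log_lik.items()}


def mapping_quality(posterior, cap=60):
    """Phred-scaled confidence that the chosen location is correct, capped at `cap`."""
    error = 1.0 - posterior
    if error <= 0.0:
        return cap
    return min(cap, int(round(-10.0 * math.log10(error))))


if __name__ == "__main__":
    # Two paralogous copies that differ at four PSV sites (made-up alleles).
    alleles = {"copy1": ["A", "C", "G", "T"], "copy2": ["G", "T", "A", "C"]}
    post = copy_posteriors(["A", "C", "G", "T"], alleles)
    best = max(post, key=post.get)
    print(best, mapping_quality(post[best]))
```

In this toy setting, a read that covers no PSV sites gets equal posteriors for both copies and therefore a low mapping quality, while a read matching one copy at several PSVs is assigned to that copy with high confidence. This mirrors the abstract's point that PSVs are what distinguish otherwise near-identical alignment locations.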
