Article

Resolving repeat families with long reads

Journal

BMC BIOINFORMATICS
Volume 20, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/s12859-019-2807-4

Keywords

Genome assembly; Repeat families; Repeat resolution

Funding

  1. Klaus Tschira Foundation

Background

Draft-quality genomes for a multitude of organisms have become common due to the advancement of genome assemblers using long-read technologies with high error rates. Although current assemblies are substantially more contiguous than assemblies based on short reads, complete chromosomal assemblies are still challenging. Interspersed repeat families with multiple copy versions dominate the contig and scaffold ends of current long-read assemblies for complex genomes. These repeat families generally remain unresolved, as existing algorithmic solutions either do not scale to large copy numbers or cannot handle the current high read error rates.

Results

We propose novel repeat resolution methods for large interspersed repeat families and assess their accuracy on simulated data sets with various distinct repeat structures and on Drosophila melanogaster transposons. Additionally, we compare our methods to an existing long-read repeat resolution tool and show the improved accuracy of our method.

Conclusions

Our results demonstrate the applicability of our methods for improving the contiguity of genome assemblies.
