4.7 Article

Comparative analysis of de novo transcriptome assembly

Journal

SCIENCE CHINA-LIFE SCIENCES
Volume 56, Issue 2, Pages 156-162

Publisher

SCIENCE PRESS
DOI: 10.1007/s11427-013-4444-x

Keywords

transcriptome assembly; next-generation sequencing; RNA-Seq; De Bruijn graph; overlap graph

Funding

  1. National Center for Research Resources from the National Institutes of Health [5P20RR016471-12]
  2. National Institute of General Medical Sciences from the National Institutes of Health [8 P20 GM103442-12]
  3. Odegard School of Aerospace Sciences
  4. School of Medicine and Health Sciences at University of North Dakota

The rapid development of next-generation sequencing technology presents a major computational challenge for data processing and analysis. The de Bruijn graph, a fast algorithm, has been successfully used for de novo assembly of genomic DNA; nevertheless, its performance for transcriptome assembly is unclear. In this study, we used both simulated and real RNA-Seq data, from either artificial RNA templates or human transcripts, to evaluate five de novo assemblers: ABySS, Mira, Trinity, Velvet and Oases. Of these assemblers, ABySS, Trinity, Velvet and Oases are all based on the de Bruijn graph, whereas Mira uses an overlap graph algorithm. Various numbers of RNA short reads were selected from the External RNA Control Consortium (ERCC) data and human chromosome 22. A number of statistics were then calculated for the resulting contigs from each assembler. Each experiment was repeated multiple times to obtain the mean statistics and standard error estimates. Trinity performed relatively well on both the ERCC and human data, but it may not consistently generate full-length transcripts. ABySS was the fastest method, but its assembly quality was low. Mira achieved a good rate of mapping its contigs onto human chromosome 22, but its computational speed was not satisfactory. Our results suggest that transcript assembly remains a challenging problem for the bioinformatics community. Therefore, a novel assembler is needed for assembling transcriptome data generated by next-generation sequencing techniques.
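To illustrate the de Bruijn graph idea that four of the five assemblers build on, here is a minimal Python sketch. It is not the implementation used by ABySS, Trinity, Velvet or Oases; the function names `build_de_bruijn` and `extend_contig`, the toy reads, and the choice of k = 4 are all assumptions made for illustration. Real assemblers add error correction, coverage handling and, for transcriptomes, resolution of alternative isoforms.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read.

    Hypothetical helper: every k-mer in a read contributes one edge from
    its prefix (first k-1 bases) to its suffix (last k-1 bases).
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Walk from `start` while exactly one distinct successor exists.

    The walk stops at a branch, a dead end, or a node already visited,
    so it always terminates.
    """
    contig = start
    node = start
    seen = {start}
    while True:
        successors = set(graph.get(node, []))
        if len(successors) != 1:
            break  # branch or dead end
        node = successors.pop()
        if node in seen:
            break  # cycle
        seen.add(node)
        contig += node[-1]
    return contig


# Toy overlapping reads (assumed data, not taken from the study).
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
contig = extend_contig(graph, "ATG")
print(contig)
```

In this toy case the three reads overlap without ambiguity, so the walk recovers a single contig. In transcriptome data, alternatively spliced isoforms and uneven expression levels create branches in the graph where such a simple walk stops, which is one reason full-length transcripts are harder to recover than genomic contigs.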
