Article

StrainSeeker: fast identification of bacterial strains from raw sequencing reads using user-provided guide trees

Journal

PEERJ
Volume 5

Publisher

PEERJ INC
DOI: 10.7717/peerj.3353

Keywords

k-mer; Clade; Strain identification; Species identification; Diagnostics

Funding

  1. European Union through the European Regional Development Fund through Estonian Centre of Excellence in Genomics and Translational Medicine [2014-2020.4.01.15-0012]
  2. Estonian Ministry of Education and Research [IUT34-11, SF0180132s08, KOGU-HUMB]
  3. Baltic Antibiotic Resistance collaborative Network (BARN)
  4. Estonian Research Council [IUT34-19]
  5. Estonian Science Foundation [9059]
  6. ARMMD [3.2.0701.11-0013]

Background: Fast, accurate and high-throughput identification of bacterial isolates is in great demand. The present work was conducted to investigate the possibility of identifying isolates from unassembled next-generation sequencing reads using custom-made guide trees.

Results: A tool named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables the identification of bacterial isolates in 1-2 min. It uses a novel algorithm, which analyses the observed and expected fractions of node-specific k-mers to test the presence of each node in the sample. This allows StrainSeeker to determine where the isolate branches off the guide tree and assign it to a clade whereas other tools assign each read to a reference genome. Using a dataset of 100 Escherichia coli isolates, we demonstrate that StrainSeeker can predict the clades of E. coli with 92% accuracy and correct tree branch assignment with 98% accuracy. Twenty-five thousand Illumina HiSeq reads are sufficient for identification of the strain.

Conclusion: StrainSeeker is a software program that identifies bacterial isolates by assigning them to nodes or leaves of a custom-made guide tree. StrainSeeker's web interface and pre-computed guide trees are available at http://bioinfo.ut.ee/strainseeker. Source code is stored at GitHub: https://github.com/bioinfo-ut/StrainSeeker.
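The idea of node-specific k-mers can be illustrated with a toy sketch. The code below is not StrainSeeker's implementation: the function names, the nested-tuple tree (standing in for a parsed Newick tree), and the definition of "specific" (present in every genome under a node and in none outside it) are assumptions made for illustration. It also reports only the observed fraction of node-specific k-mers and omits the comparison with an expected fraction that the abstract describes.

```python
from typing import Dict, FrozenSet, Set, Union

# Hypothetical tree encoding: a leaf is a genome name, an inner node is a tuple of children.
Tree = Union[str, tuple]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the names of all leaves below (or at) this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def node_specific_kmers(tree: Tree, genomes: Dict[str, str], k: int) -> Dict[FrozenSet[str], Set[str]]:
    """Map each node (keyed by its leaf set) to k-mers shared by all of its
    leaves and absent from every genome outside the node (assumed definition)."""
    leaf_kmers = {name: kmers(seq, k) for name, seq in genomes.items()}
    result: Dict[FrozenSet[str], Set[str]] = {}

    def visit(node: Tree) -> None:
        inside = leaves(node)
        shared = set.intersection(*(leaf_kmers[name] for name in inside))
        outside = set().union(*(leaf_kmers[n] for n in leaf_kmers if n not in inside))
        result[inside] = shared - outside
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return result


def observed_fraction(specific: Set[str], sample_kmers: Set[str]) -> float:
    """Fraction of a node's specific k-mers that occur in the sample."""
    if not specific:
        return 0.0
    return len(specific & sample_kmers) / len(specific)


if __name__ == "__main__":
    # Toy genomes and reads, invented for this example.
    genomes = {"A": "AAACCC", "B": "AAAGGG", "C": "TTTGGG"}
    tree = (("A", "B"), "C")
    reads = ["AAAC", "ACCC"]
    sample = set().union(*(kmers(read, 3) for read in reads))
    for node, specific in node_specific_kmers(tree, genomes, 3).items():
        print(sorted(node), observed_fraction(specific, sample))
```

In this sketch a sample would be placed at the deepest node whose specific k-mers are well represented in the reads. A real tool would additionally need to account for sequencing depth and errors, which is where the expected fraction comes in.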
