Article

SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision

BIOINFORMATICS (2014)

Journal

BIOINFORMATICS

Volume 30, Issue 18, Pages 2652-2653

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bioinformatics/btu343

Categories

Biochemical Research Methods; Biotechnology & Applied Microbiology; Computer Science, Interdisciplinary Applications; Mathematical & Computational Biology; Statistics & Probability

Funding

Scientific Exchange Programme NMS-CH [12.289]
Swiss National Science Foundation [310000-116502]

Abstract

Summary: Many time-consuming analyses of next-generation sequencing data can be addressed with modern cloud computing. Apache Hadoop-based solutions have become popular in genomics because of their scalability in a cloud infrastructure. So far, most of these tools have been used for batch data processing rather than interactive data querying. The SparkSeq software has been created to take advantage of a new MapReduce framework, Apache Spark, for next-generation sequencing data. SparkSeq is a general-purpose, flexible and easily extendable library for genomic cloud computing. It can be used to build genomic analysis pipelines in Scala and run them interactively. SparkSeq opens up the possibility of customized ad hoc secondary analyses and iterative machine learning algorithms. This article demonstrates its scalability and overall fast performance by running analyses of sequencing datasets. Tests of SparkSeq also show that cache usage and HDFS block size can be tuned for optimal performance on multiple worker nodes.
