4.7 Article

Hadoop-BAM: directly manipulating next generation sequencing data in the cloud

Journal

BIOINFORMATICS
Volume 28, Issue 6, Pages 876-877

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bts054

Keywords

-

Funding

  1. Finnish Funding Agency for Technology and Innovation Tekes
  2. Academy of Finland [139402]

Hadoop-BAM is a novel library for the scalable manipulation of aligned next-generation sequencing data in the Hadoop distributed computing framework. It acts as an integration layer between analysis applications and BAM files that are processed using Hadoop. Hadoop-BAM solves the issues related to BAM data access by presenting a convenient API for implementing map and reduce functions that can directly operate on BAM records. It builds on top of the Picard SAM JDK, so tools that rely on the Picard API are expected to be easily convertible to support large-scale distributed processing. In this article we demonstrate the use of Hadoop-BAM by building a coverage summarizing tool for the Chipster genome browser. Our results show that Hadoop offers good scalability, and one should avoid moving data in and out of Hadoop between analysis steps.
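The map and reduce roles in a coverage summary can be illustrated with a minimal sketch in plain Python. It does not use Hadoop, Hadoop-BAM, or the Picard API, and it is not the authors' implementation. The `Alignment` type, the function names, and the fixed-width binning scheme are all assumptions made for illustration: each aligned read is reduced to a chromosome and a half-open interval, the map step emits the number of bases a read contributes to each bin, and the reduce step sums those counts into a mean depth per bin.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Alignment(NamedTuple):
    """Hypothetical stand-in for an aligned read (not a real BAM record)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


BinKey = Tuple[str, int]  # (chromosome, bin index)


def map_alignment(aln: Alignment, bin_size: int) -> Iterator[Tuple[BinKey, int]]:
    """Map step: emit ((chrom, bin), covered_bases) for each bin the read overlaps."""
    first_bin = aln.start // bin_size
    last_bin = (aln.end - 1) // bin_size
    for b in range(first_bin, last_bin + 1):
        bin_start = b * bin_size
        bin_end = bin_start + bin_size
        overlap = min(aln.end, bin_end) - max(aln.start, bin_start)
        yield (aln.chrom, b), overlap


def reduce_coverage(pairs: Iterable[Tuple[BinKey, int]], bin_size: int) -> Dict[BinKey, float]:
    """Reduce step: sum covered bases per key and convert to mean depth per bin."""
    totals: Dict[BinKey, int] = defaultdict(int)
    for key, bases in pairs:
        totals[key] += bases
    return {key: total / bin_size for key, total in totals.items()}


def summarize_coverage(alignments: Iterable[Alignment], bin_size: int = 100) -> Dict[BinKey, float]:
    """Run the map and reduce steps in-process, with no distribution or shuffling."""
    pairs = (pair for aln in alignments for pair in map_alignment(aln, bin_size))
    return reduce_coverage(pairs, bin_size)


if __name__ == "__main__":
    reads = [Alignment("chr1", 0, 50), Alignment("chr1", 90, 110), Alignment("chr2", 10, 20)]
    for key, depth in sorted(summarize_coverage(reads).items()):
        print(key, depth)
```

In a distributed setting the framework would group the emitted pairs by key between the two steps; here a single dictionary plays that role, which is enough to show why the two functions can be written independently of how the records are read.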

Recommended

Article Computer Science, Information Systems

IoTEF: A Federated Edge-Cloud Architecture for Fault-Tolerant IoT Applications

Asad Javed, Jeremy Robert, Keijo Heljanko, Kary Framling

JOURNAL OF GRID COMPUTING (2020)

Article Computer Science, Artificial Intelligence

AlphaLogger: detecting motion-based side-channel attack using smartphone keystrokes

Abdul Rehman Javed, Mirza Omer Beg, Muhammad Asim, Thar Baker, Ali Hilal Al-Bayatti

JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING (2020)

Article Engineering, Chemical

Data-Driven Approach to Grade Change Scheduling Optimization in a Paper Machine

Hossein Mostafaei, Teemu Ikonen, Jason Kramb, Tewodros Deneke, Keijo Heljanko, Iiro Harjunkoski

INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH (2020)

Article Engineering, Chemical

Synergistic and Intelligent Process Optimization: First Results and Open Challenges

Iiro Harjunkoski, Teemu Ikonen, Hossein Mostafaei, Tewodros Deneke, Keijo Heljanko

INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH (2020)

Article Biochemical Research Methods

A framework to assess the quality and impact of bioinformatics training across ELIXIR

Kim T. Gurwitz, Prakash Singh Gaur, Louisa J. Bellis, Lee Larcombe, Eva Alloza, Balint Laszlo Balint, Alexander Botzki, Jure Dimec, Victoria Dominguez del Angel, Pedro L. Fernandes, Eija Korpelainen, Roland Krause, Mateusz Kuzak, Loredana Le Pera, Brane Leskosek, Jessica M. Lindvall, Diana Marek, Paula A. Martinez, Tuur Muyldermans, Stale Nygard, Patricia M. Palagi, Hedi Peterson, Fotis Psomopoulos, Vojtech Spiwok, Celia W. G. van Gelder, Allegra Via, Marko Vidak, Daniel Wibberg, Sarah L. Morgan, Gabriella Rustici

PLOS COMPUTATIONAL BIOLOGY (2020)

Article Computer Science, Interdisciplinary Applications

Reinforcement learning of adaptive online rescheduling timing and computing time allocation

Teemu J. Ikonen, Keijo Heljanko, Iiro Harjunkoski

COMPUTERS & CHEMICAL ENGINEERING (2020)

Article Computer Science, Software Engineering

An optimal cut-off algorithm for parameterised refinement checking

Antti Siirtola, Keijo Heljanko

SCIENCE OF COMPUTER PROGRAMMING (2020)

Article Engineering, Chemical

Dynamic Process Intensification via Data-Driven Dynamic Optimization: Concept and Application to Ternary Distillation

Lingqing Yan, Tewodros L. Deneke, Keijo Heljanko, Iiro Harjunkoski, Thomas F. Edgar, Michael Baldea

Summary: Process intensification aims to make chemical processes safer and more efficient through significant modifications to design and structure. Dynamic process intensification (DPI) introduces operational changes to achieve the same product generation as steady-state operation, but with improved economics. The novel dynamic optimization-based DPI (Do-DPI) strategy involves true cyclic operation and can reduce energy use while maintaining product quality and production rate.

INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH (2021)

Article Multidisciplinary Sciences

Distributed hybrid-indexing of compressed pan-genomes for scalable and fast sequence alignment

Altti Ilari Maarala, Ossi Arasalo, Daniel Valenzuela, Veli Makinen, Keijo Heljanko

Summary: Computational pan-genomics analyzes information from multiple individual genomes to discover genetic variation thoroughly. With the rapid growth of whole-genome sequencing data, efficient data compression and indexing methods are crucial, especially for exploiting distributed and parallel computing more effectively.

PLOS ONE (2021)

Article Engineering, Chemical

Surrogate-based optimization of a periodic rescheduling algorithm

Teemu J. Ikonen, Keijo Heljanko, Iiro Harjunkoski

Summary: Periodic rescheduling is an iterative method used for real-time decision-making in industrial process operations. The design of such methods involves high-level decisions on when and how to schedule, with optimal choices depending on the operating environment. We propose the use of surrogate-based optimization to determine continuous control parameter choices, reducing computational costs.

AICHE JOURNAL (2022)

Proceedings Paper Computer Science, Hardware & Architecture

Progress in Certifying Hardware Model Checking Results

Emily Yu, Armin Biere, Keijo Heljanko

Summary: A formal framework was presented to certify k-induction-based model checking results, utilizing the concept of k-witness circuit and a simple inductive invariant. The approach reduces the certification problem to pure SAT checks and checking a simple QBF with one quantifier alternation in order to allow proofs to be checked with an independent proof checker. The resulting certification toolkit CERTIFAIGER was evaluated on instances from the hardware model checking competition, demonstrating the practical use of the certification method.

COMPUTER AIDED VERIFICATION, PT II, CAV 2021 (2021)

Proceedings Paper Computer Science, Software Engineering

BMC for Weak Memory Models: Relation Analysis for Compact SMT Encodings

Natalia Gavrilenko, Hernan Ponce-de-Leon, Florian Furbach, Keijo Heljanko, Roland Meyer

COMPUTER AIDED VERIFICATION, CAV 2019, PT I (2019)

Proceedings Paper Computer Science, Information Systems

Access Time Improvement Framework for Standardized IoT Gateways

Asad Javed, Narges Yousefnezhad, Jeremy Robert, Keijo Heljanko, Kary Framling

2019 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS WORKSHOPS (PERCOM WORKSHOPS) (2019)

Article Biochemical Research Methods

A global perspective on evolving bioinformatics and data science training needs

Teresa K. Attwood, Sarah Blackford, Michelle D. Brazas, Angela Davies, Maria Victoria Schneider

BRIEFINGS IN BIOINFORMATICS (2019)

Proceedings Paper Computer Science, Interdisciplinary Applications

BMC with Memory Models as Modules

Hernan Ponce-de-Leon, Florian Furbach, Keijo Heljanko, Roland Meyer

PROCEEDINGS OF THE 2018 18TH CONFERENCE ON FORMAL METHODS IN COMPUTER AIDED DESIGN (FMCAD) (2018)
