Article

Optimizing data intensive GPGPU computations for DNA sequence alignment

Journal

PARALLEL COMPUTING
Volume 35, Issue 8-9, Pages 429-440

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.parco.2009.05.002

Keywords

Short read mapping; GPGPU; Suffix trees; CUDA

Funding

  1. National Institutes of Health [R01-LM006845, R01-GM083873]
  2. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [0844494]

Abstract

MUMmerGPU uses highly parallel commodity graphics processing units (GPUs) to accelerate the data-intensive computation of aligning next-generation DNA sequence data to a reference sequence, for use in diverse applications such as disease genotyping and personal genomics. MUMmerGPU 2.0 features a new stackless depth-first-search print kernel and is 13x faster than the serial CPU version of the alignment code and nearly 4x faster in total computation time than MUMmerGPU 1.0. We exhaustively examined 128 GPU data layout configurations to improve register footprint and running time, and conclude that higher occupancy has a greater impact than reduced latency. MUMmerGPU is available open-source at http://www.mummergpu.sourceforge.net. (C) 2009 Elsevier B.V. All rights reserved.
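
The abstract names a stackless depth-first search but does not describe it. As a rough illustration of the general technique only, and not the MUMmerGPU kernel itself, the sketch below walks a tree stored in a flat array using parent and sibling links instead of an explicit stack. The `Node` layout, the field names, and the `leaves_below` function are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One tree node in a flat array; every link is an array index (assumed layout)."""
    label: str
    parent: Optional[int] = None
    first_child: Optional[int] = None
    next_sibling: Optional[int] = None


def leaves_below(nodes: List[Node], root: int) -> List[str]:
    """Collect the leaf labels under `root` in depth-first order without a stack."""
    leaves: List[str] = []
    cur = root
    while True:
        # Descend along first-child links until a leaf is reached.
        if nodes[cur].first_child is not None:
            cur = nodes[cur].first_child
            continue
        leaves.append(nodes[cur].label)
        # Climb through parent links until a node with an unvisited sibling
        # is found, or until we are back at the starting node.
        while cur != root and nodes[cur].next_sibling is None:
            cur = nodes[cur].parent
        if cur == root:
            return leaves
        cur = nodes[cur].next_sibling


# Hypothetical five-node tree: root -> (a, b), a -> (c, d).
example = [
    Node("root", None, 1, None),
    Node("a", 0, 3, 2),
    Node("b", 0, None, None),
    Node("c", 1, None, 4),
    Node("d", 1, None, None),
]
print(leaves_below(example, 0))  # ['c', 'd', 'b']
```

The traversal keeps only a single current index as state, which is the property that makes this style attractive where per-thread stack storage is scarce; the cost is the extra parent and sibling links stored in every node.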

Recommended

Article Biochemistry & Molecular Biology

Genome-wide patterns of transposon proliferation in an evolutionary young hybrid fish

Stefan Dennenmoser, Fritz J. Sedlazeck, Michael C. Schatz, Janine Altmueller, Matthias Zytnicki, Arne W. Nolte

MOLECULAR ECOLOGY (2019)

Article Biochemistry & Molecular Biology

De novo genome assembly of Candida glabrata reveals cell wall protein complement and structure of dispersed tandem repeat arrays

Zhuwei Xu, Brian Green, Nicole Benoit, Michael Schatz, Sarah Wheelan, Brendan Cormack

MOLECULAR MICROBIOLOGY (2020)

Article Biochemical Research Methods

Ribbon: intuitive visualization for complex genomic variation

Maria Nattestad, Robert Aboukhalil, Chen-Shan Chin, Michael C. Schatz

Summary: Ribbon is a visualization tool that shows the positioning of alignments in both the reference genome and read contexts, providing an intuitive view to better understand structural variants and their supporting read evidence. It was developed to curate complex structural variant calls and determine if they are well supported by long-read evidence, using the same intuitive visualization method for genome-to-genome comparisons.

BIOINFORMATICS (2021)

Article Biotechnology & Applied Microbiology

Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED

Sam Kovaka, Yunfan Fan, Bohan Ni, Winston Timp, Michael C. Schatz

Summary: UNCALLED is an open-source mapper that rapidly matches nanopore current signals to a reference sequence, enabling purely computational targeted sequencing. It has been used to deplete known bacterial genomes from a metagenomics community and enrich other species, as well as enrich human genes associated with hereditary cancers for accurate detection of genetic variations and modifications.

NATURE BIOTECHNOLOGY (2021)

Article Biology

iGenomics: Comprehensive DNA sequence analysis on your Smartphone

Aspyn Palatnick, Bin Zhou, Elodie Ghedin, Michael C. Schatz

GIGASCIENCE (2020)

Article Biochemistry & Molecular Biology

Optimized sample selection for cost-efficient long-read population sequencing

T. Rhyker Ranallo-Benavidez, Zachary Lemmon, Sebastian Soyk, Sergey Aganezov, William J. Salerno, Rajiv C. McCoy, Zachary B. Lippman, Michael C. Schatz, Fritz J. Sedlazeck

Summary: SVCollector is a tool for selecting optimal subsets for resequencing, which analyses population-level VCF files from the 1000 Genomes Project and 3000 Rice Genomes Project to choose individuals with higher representativeness and more genetic diversity. The tool solves this optimization problem using a fast, greedy heuristic and an exact algorithm with integer linear programming, and can estimate the changes in diversity present with different numbers of samples.

GENOME RESEARCH (2021)

Article Biochemistry & Molecular Biology

High resolution copy number inference in cancer using short-molecule nanopore sequencing

Timour Baslan, Sam Kovaka, Fritz J. Sedlazeck, Yanming Zhang, Robert Wappel, Sha Tian, Scott W. Lowe, Sara Goodwin, Michael C. Schatz

Summary: Genome copy number is a significant source of genetic variation in health and disease, with Copy Number Alterations (CNAs) being inferred from short-read sequencing data in cancer. The emerging Nanopore sequencing technologies offer potential for broader clinical utility, while short-molecule Nanopore sequencing can improve accuracy in CNA inference by increasing sequence read yield.

NUCLEIC ACIDS RESEARCH (2021)

Article Biology

Local adaptation and archaic introgression shape global diversity at human structural variant loci

Stephanie M. Yan, Rachel M. Sherman, Dylan J. Taylor, Divya R. Nair, Andrew N. Bortvin, Michael C. Schatz, Rajiv C. McCoy

Summary: This study utilized advanced techniques to uncover key evolutionary events in the human genome, revealing that a specific haplotype present in certain Southeast Asian populations can be traced back to Neanderthal gene flow.

Article Multidisciplinary Sciences

Pan-genomic matching statistics for targeted nanopore sequencing

Omar Ahmed, Massimiliano Rossi, Sam Kovaka, Michael C. Schatz, Travis Gagie, Christina Boucher, Ben Langmead

Summary: Nanopore sequencing is a powerful tool for genomics, and the novel method SPUMONI uses efficient pan-genome indexes to achieve rapid and accurate targeted sequencing. Compared to traditional methods, SPUMONI is faster and has a smaller memory footprint.

ISCIENCE (2021)

Article Biology

Artificial Intelligence and Cardiovascular Genetics

Chayakrit Krittanawong, Kipp W. Johnson, Edward Choi, Scott Kaplin, Eric Venner, Mullai Murugan, Zhen Wang, Benjamin S. Glicksberg, Christopher I. Amos, Michael C. Schatz, W. H. Wilson Tang

Summary: Polygenic diseases present unique challenges for diagnosis and management, but the advancements in AI and genomics offer unprecedented possibilities for personalized medicine.

LIFE-BASEL (2022)

Article Evolutionary Biology

Complete Sequence of a 641-kb Insertion of Mitochondrial DNA in the Arabidopsis thaliana Nuclear Genome

Peter D. Fields, Gus Waneka, Matthew Naish, Michael C. Schatz, Ian R. Henderson, Daniel B. Sloan

Summary: Intracellular transfers of mitochondrial DNA play a role in shaping nuclear genomes. Researchers have discovered a large nuclear insertion of mitochondrial DNA (numts) in Chromosome 2 of the model plant Arabidopsis thaliana. Using improved long-read sequencing technologies, they were able to determine the accurate sequence and structure of this numt, which is 641 kb in length. The study also found that the numt is transcriptionally inactive and has high levels of cytosine methylation.

GENOME BIOLOGY AND EVOLUTION (2022)

Article Biochemistry & Molecular Biology

Establishing Physalis as a Solanaceae model system enables genetic reevaluation of the inflated calyx syndrome

Jia He, Michael Alonge, Srividya Ramakrishnan, Matthias Benoit, Sebastian Soyk, Nathan T. Reem, Anat Hendelman, Joyce Van Eck, Michael C. Schatz, Zachary B. Lippman

Summary: The highly diverse Solanaceae family contains multiple widely studied models and crop species. This study focuses on the exploration of the diversity within the family, particularly in the genus Physalis. The researchers developed transformation and genome editing techniques in Physalis grisea and identified natural and engineered variations in floral traits. However, the study found that CRISPR-Cas9 targeted mutagenesis did not have an effect on the inflated calyx syndrome, a notable trait in the family. The study identified a mutation in an AP2-like gene causing the fusion of sepals and petals. These findings establish Physalis as a new model system in the Solanaceae family and provide insights into the factors driving the inflated calyx syndrome.

PLANT CELL (2023)

Article Biochemical Research Methods

Jasmine and Iris: population-scale structural variant comparison and analysis

Melanie Kirsche, Gautam Prabhu, Rachel Sherman, Bohan Ni, Alexis Battle, Sergey Aganezov, Michael C. Schatz

Summary: Jasmine and Iris, as fast and accurate tools, provide solutions for the comparison and analysis of structural variants (SVs) in a population. Jasmine outperforms six widely used comparison methods and identifies a set of high-confidence de novo SVs. Additionally, a unified callset of SVs and indels is provided for genotyping and assessing their impact on gene expression.

NATURE METHODS (2023)

Editorial Material Biochemical Research Methods

Approaching complete genomes, transcriptomes and epi-omes with accurate long-read sequencing

Sam Kovaka, Shujun Ou, Katharine M. Jenike, Michael C. Schatz

Summary: The year 2022 marks a significant milestone for accurate and fast long-read sequencing, offering competitive costs. This article discusses the crucial bioinformatics techniques required for empowering long reads in various applications and presents a vision for the future of long-read sequencing.

NATURE METHODS (2023)

Letter Cardiac & Cardiovascular Systems

National Human Genome Research Institute Genomic Data Science Analysis, Visualization, and Informatics Lab-Space: Reaching out to Clinicians

Jennifer L. Hall, Sally Honeycutt, Nicole Gonzalez, Anne O'Donnell-Luria, Casey Overby Taylor, Laura Stevens, Anthony A. Philippakis, Michael C. Schatz

CIRCULATION-GENOMIC AND PRECISION MEDICINE (2023)
