4.7 Article

GalaxyTrakr: a distributed analysis tool for public health whole genome sequence data accessible to non-bioinformaticians

Journal

BMC GENOMICS
Volume 22, Issue 1

Publisher

BMC
DOI: 10.1186/s12864-021-07405-8

Keywords

Galaxy; Biosurveillance; Whole genome sequencing; Food safety; Public health; GenomeTrakr; Genomic surveillance

Funding

  1. Center for Food Safety and Applied Nutrition at the U.S. Food and Drug Administration


GalaxyTrakr is a customized instance of the Galaxy platform designed for laboratory scientists conducting food safety regulatory research. It provides tools for quality assessment, linking clinical isolates with food/environmental samples, and exploring new methodologies like metagenomics. With over 600 registered users and 450,000 analytical jobs processed, it promotes collaboration across public health laboratories and supports consistent interpretation of results.
Background: Processing and analyzing whole genome sequencing (WGS) data is computationally intense: a single Illumina MiSeq WGS run produces approximately 1 million 250-base-pair reads for each of 24 samples. This poses significant obstacles for smaller laboratories, or laboratories not affiliated with larger projects, which may not have dedicated bioinformatics staff or computing power to effectively use genomic data to protect public health. Building on the success of the cloud-based Galaxy bioinformatics platform (http://galaxyproject.org), already known for its user-friendliness and powerful WGS analytical tools, the Center for Food Safety and Applied Nutrition (CFSAN) at the U.S. Food and Drug Administration (FDA) created a customized 'instance' of the Galaxy environment, called GalaxyTrakr (https://www.galaxytrakr.org), for use by laboratory scientists performing food-safety regulatory research. The goal was to enable laboratories outside of the FDA internal network to (1) perform quality assessments of sequence data, (2) identify links between clinical isolates and positive food/environmental samples, including those at the National Center for Biotechnology Information sequence read archive (https://www.ncbi.nlm.nih.gov/sra/), and (3) explore new methodologies such as metagenomics. GalaxyTrakr hosts a variety of free and adaptable tools and provides the data storage and computing power to run the tools. These tools support coordinated analytic methods and consistent interpretation of results across laboratories. Users can create and share tools for their specific needs and use sequence data generated locally and elsewhere.

Results: In its first full year (2018), GalaxyTrakr processed over 85,000 jobs and went from 25 to 250 users, representing 53 different public and state health laboratories, academic institutions, international health laboratories, and federal organizations. By mid-2020, it had grown to 600 registered users and processed over 450,000 analytical jobs. To illustrate how laboratories are making use of this resource, we describe how six institutions use GalaxyTrakr to quickly analyze and review their data. Instructions for participating in GalaxyTrakr are provided.

Conclusions: GalaxyTrakr advances food safety by providing reliable and harmonized WGS analyses for public health laboratories and promoting collaboration across laboratories with differing resources. Anticipated enhancements to this resource will include workflows for additional foodborne pathogens, viruses, and parasites, as well as new tools and services.
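
To give a sense of the data volume mentioned in the Background, the following is a rough back-of-the-envelope sketch in Python. It uses only the approximate figures quoted in the abstract (about 1 million reads of 250 base pairs for each of 24 samples). The function name `bases_per_run` is hypothetical, and the estimate assumes every read is full length, so it should be read as an upper-bound illustration rather than a measured value.

```python
# Rough estimate of raw sequence output for one Illumina MiSeq WGS run,
# using the approximate figures quoted in the Background.
READS_PER_SAMPLE = 1_000_000   # ~1 million reads per sample (from the abstract)
READ_LENGTH_BP = 250           # 250-base-pair reads (from the abstract)
SAMPLES_PER_RUN = 24           # 24 samples per run (from the abstract)


def bases_per_run(reads_per_sample: int, read_length_bp: int, samples: int) -> int:
    """Total sequenced bases, assuming every read is full length (an assumption)."""
    return reads_per_sample * read_length_bp * samples


total_bases = bases_per_run(READS_PER_SAMPLE, READ_LENGTH_BP, SAMPLES_PER_RUN)

# "Gb" here means gigabases (10^9 bases), not gigabytes on disk.
print(f"{total_bases:,} bases per run (~{total_bases / 1e9:.0f} Gb)")
# prints: 6,000,000,000 bases per run (~6 Gb)
```

On these assumptions, a single run yields roughly 6 billion bases, which illustrates why laboratories without dedicated computing resources may struggle to process the data locally.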


