Article

Reading the Underlying Information From Massive Metagenomic Sequencing Data

Journal

PROCEEDINGS OF THE IEEE
Volume 105, Issue 3, Pages 459-473

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JPROC.2016.2604406

Keywords

Algorithms; biology; DNA; genetics; microorganisms; pattern recognition; sequences

Funding

  1. National Science Foundation of China (NSFC) [61561146396, 61673231]
  2. National Basic Research Program of China [2012CB316504]

Microorganisms are everywhere. Recent studies have shown that the mixture of microbes on the human body, or the microbiome, plays important roles in human physiology and disease. Metagenomic sequencing is a key technology for studying microbiomes. It produces massive amounts of data in the form of short sequencing reads: in a typical shotgun metagenomic sequencing study, a single sample can contain 10^7 to 10^8 reads of about 100 nucleotides (nt) each. These reads contain rich information about microbiomes and their functions, but reading out that information from such huge, highly fragmented data poses multiple challenges for mathematical models, bioinformatics methods, and computer algorithms. In this paper, we review the basic bioinformatics tasks and existing methods for processing and analyzing metagenomic data, and discuss remaining open challenges and practical observations. The aim of the paper is to give readers a complete picture of metagenomic data processing and analysis, and to offer a reference and a starting perspective for computational scientists who are interested in this exciting field.
