Article

Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon

Journal

NATURE COMMUNICATIONS
Volume 9

Publisher

NATURE PUBLISHING GROUP
DOI: 10.1038/s41467-018-06910-x

Funding

  1. National Science Foundation (NSF) [DBI-ABI 0965596, IIS-1453527, IIS-1421908, CCF-1439057]
  2. Eberly College of Sciences at PSU
  3. Penn State Institute of Cyberscience
  4. National Center for Research Resources
  5. National Center for Advancing Translational Sciences, National Institutes of Health [UL1TR000127]
  6. Pennsylvania Department of Health using Tobacco Settlement and CURE funds
  7. Division of Computing and Communication Foundations
  8. Directorate for Computer and Information Science and Engineering, National Science Foundation [1439057]

A significant portion of genes in vertebrate genomes belongs to multigene families, with each family containing several gene copies whose presence/absence, as well as isoform structure, can be highly variable across individuals. Existing de novo techniques for assaying the sequences of such highly similar gene families fall short of reconstructing end-to-end transcripts with nucleotide-level precision or assigning alternatively spliced transcripts to their respective gene copies. We present IsoCon, a high-precision method using long PacBio Iso-Seq reads to tackle this challenge. We apply IsoCon to nine Y chromosome ampliconic gene families and show that it outperforms existing methods on both experimental and simulated data. IsoCon has allowed us to detect an unprecedented number of novel isoforms and has opened the door for unraveling the structure of many multigene families and gaining a deeper understanding of genome evolution and human diseases.
