Journal
GENOMICS
Volume 110, Issue 6, Pages 375-381
Publisher
ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.ygeno.2017.12.007
Keywords
Quasispecies; Clustering; Max K-cut; Next generation sequencing; RNA viruses
Funding
- National Science Foundation [CCF-1618427]
RNA viruses are characterized by high mutation rates that give rise to populations of closely related genomes, known as viral quasispecies. This underlying heterogeneity enables the quasispecies to adapt to changing conditions and proliferate over the course of an infection. Determining the genetic diversity of a virus (i.e., inferring haplotypes and their proportions in the population) is essential for understanding its mutation patterns and for effective drug development. Here, we present QSdpR, a method and software for the reconstruction of quasispecies from short sequencing reads. The reconstruction is achieved by solving a correlation clustering problem on a read-similarity graph. The results of the clustering are used to estimate the frequencies of sub-species, and the number of sub-species is determined using the pseudo-F index. Extensive tests on both synthetic datasets and experimental HIV-1 and Zika virus data demonstrate that QSdpR compares favorably to existing methods in terms of various performance metrics.
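The last two steps named in the abstract (estimating sub-species frequencies from a clustering, and scoring a candidate number of clusters with a pseudo-F index) can be illustrated with a minimal sketch. This is not the QSdpR implementation: it assumes equal-length, fully overlapping reads, takes the cluster labels as given rather than solving the correlation clustering problem, uses cluster-size proportions as frequencies, and uses a Hamming-distance form of the pseudo-F (Calinski-Harabasz) ratio. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length reads differ."""
    return sum(x != y for x, y in zip(a, b))


def group_by_label(reads, labels):
    """Collect reads into lists keyed by their cluster label."""
    groups = {}
    for read, label in zip(reads, labels):
        groups.setdefault(label, []).append(read)
    return groups


def summarize_clusters(reads, labels):
    """Return {label: (consensus sequence, fraction of reads)}.

    Assumption: cluster-size proportion stands in for sub-species frequency.
    """
    total = len(reads)
    summary = {}
    for label, members in group_by_label(reads, labels).items():
        # Majority base per column; ties go to the base seen first.
        consensus = "".join(
            Counter(col).most_common(1)[0][0] for col in zip(*members)
        )
        summary[label] = (consensus, len(members) / total)
    return summary


def scatter(reads):
    """Sum of pairwise Hamming distances divided by the group size.

    Proportional to the sum of squared distances to the centroid of the
    one-hot encoded reads; the constant cancels in the pseudo-F ratio.
    """
    return sum(hamming(a, b) for a, b in combinations(reads, 2)) / len(reads)


def pseudo_f(reads, labels):
    """Pseudo-F: (between / (k - 1)) / (within / (n - k))."""
    groups = group_by_label(reads, labels)
    n, k = len(reads), len(groups)
    if not 1 < k < n:
        raise ValueError("need 1 < number of clusters < number of reads")
    within = sum(scatter(members) for members in groups.values())
    between = scatter(reads) - within
    if within == 0:
        return float("inf")  # perfectly homogeneous clusters
    return (between / (k - 1)) / (within / (n - k))


# Toy data (made up for illustration): five reads in two clusters.
reads = ["ACGT", "ACGT", "ACGA", "TTGT", "TTGT"]
labels = [0, 0, 0, 1, 1]
summary = summarize_clusters(reads, labels)
score = pseudo_f(reads, labels)
```

Under these assumptions, one would compute `pseudo_f` for clusterings with different numbers of clusters and keep the one with the highest score; how QSdpR itself performs this selection is not specified in the abstract.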