Journal
MOLECULAR BIOLOGY AND EVOLUTION
Volume 37, Issue 10, Pages 3047-3060Publisher
OXFORD UNIV PRESS
DOI: 10.1093/molbev/msaa130
Keywords
linear-time gradient algorithm; random-effects molecular clock model; Bayesian inference; maximum likelihood
Funding
- European Research Council under the European Union's Horizon 2020 research and innovation program [725422]
- Wellcome Trust [206298/Z/17/Z]
- NSF [DMS 1264153]
- NIH [R01 AI107034, U19 AI135995]
- NIH-NIAID [K25AI153816]
- Interne Fondsen KU Leuven/Internal Funds KU Leuven [C14/18/094]
- Research Foundation-Flanders (Fonds voor Wetenschappelijk Onderzoek-Vlaanderen) [G066215N, G0D5117N, G0B9317N]
Ask authors/readers for more resources
Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order O(N)-dimensional gradient calculations based on the standard pruning algorithm require O(N-2) operations, where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend toward even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make these analyses tractable, we present a linear-time algorithm for O(N)-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus, and Lassa virus. Our proposed algorithm significantly improves inference efficiency with a 126- to 234-fold increase in maximum-likelihood optimization and a 16- to 33-fold computational performance increase in a Bayesian framework.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available