☆ 4.8 Article

Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics

MOLECULAR BIOLOGY AND EVOLUTION (2020)

Journal

MOLECULAR BIOLOGY AND EVOLUTION

Volume 37, Issue 10, Pages 3047-3060

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/molbev/msaa130

Keywords

linear-time gradient algorithm; random-effects molecular clock model; Bayesian inference; maximum likelihood

Funding

European Research Council under the European Union's Horizon 2020 research and innovation program [725422]
Wellcome Trust [206298/Z/17/Z]
NSF [DMS 1264153]
NIH [R01 AI107034, U19 AI135995]
NIH-NIAID [K25AI153816]
Interne Fondsen KU Leuven/Internal Funds KU Leuven [C14/18/094]
Research Foundation-Flanders (Fonds voor Wetenschappelijk Onderzoek-Vlaanderen) [G066215N, G0D5117N, G0B9317N]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

Calculation of the log-likelihood stands as the computational bottleneck for many statistical phylogenetic algorithms. Even worse is its gradient evaluation, often used to target regions of high probability. Order O(N)-dimensional gradient calculations based on the standard pruning algorithm require O(N-2) operations, where N is the number of sampled molecular sequences. With the advent of high-throughput sequencing, recent phylogenetic studies have analyzed hundreds to thousands of sequences, with an apparent trend toward even larger data sets as a result of advancing technology. Such large-scale analyses challenge phylogenetic reconstruction by requiring inference on larger sets of process parameters to model the increasing data heterogeneity. To make these analyses tractable, we present a linear-time algorithm for O(N)-dimensional gradient evaluation and apply it to general continuous-time Markov processes of sequence substitution on a phylogenetic tree without a need to assume either stationarity or reversibility. We apply this approach to learn the branch-specific evolutionary rates of three pathogenic viruses: West Nile virus, Dengue virus, and Lassa virus. Our proposed algorithm significantly improves inference efficiency with a 126- to 234-fold increase in maximum-likelihood optimization and a 16- to 33-fold computational performance increase in a Bayesian framework.

Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics

Journal

MOLECULAR BIOLOGY AND EVOLUTION

Publisher

OXFORD UNIV PRESS

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Gradients Do Grow on Trees: A Linear-Time O(N)-Dimensional Gradient for Statistical Phylogenetics

Journal

MOLECULAR BIOLOGY AND EVOLUTION

Publisher

OXFORD UNIV PRESS

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper