Article
Biology
Brandon Legried, Jonathan Terhorst
Summary: Recent theoretical work reveals differing perspectives on the estimation of phylogenetic birth-death models with lineage-through-time data. While Louca and Pennell (2020) argue that models with continuously differentiable rate functions are nonidentifiable, Legried and Terhorst (2022) show that identifiability can be restored by considering piecewise constant rate functions. In this study, we contribute new theoretical results to this ongoing discussion. We prove that models based on piecewise polynomial rate functions, regardless of the order or number of pieces, are statistically identifiable, including spline-based models with arbitrary knots. However, we also highlight the challenge of rate function estimation, even when identifiability is achieved, by presenting information-theoretic lower bounds for hypothesis testing using birth-death models.
JOURNAL OF THEORETICAL BIOLOGY
(2023)
Article
Biochemistry & Molecular Biology
Leo A. Featherstone, Sebastian Duchene, Timothy G. Vaughan
Summary: Despite its increasing role in understanding infectious disease transmission, phylodynamics lacks clarity on ideal data and optimal sampling. This study introduces a method to visualize and quantify the impact of pathogen genome sequence and sampling times on phylodynamic inference. By applying the method to simulated and real-world data, the study provides insights and guidelines for maximizing the use of sequence data in phylodynamic analyses. The continued research on phylodynamic data and inference is crucial for targeted and efficient responses to infectious disease threats.
MOLECULAR BIOLOGY AND EVOLUTION
(2023)
Article
Virology
Anthony Lam, Sebastian Duchene
Summary: Phylodynamic inference is crucial in understanding the transmission dynamics of viral outbreaks, with estimating the molecular evolutionary rate being an essential first step. The birth-death model outperforms the coalescent exponential model in estimating epidemiological parameters when faced with low diversity sequence data, while the coalescent model requires additional samples and sequence variability for accurate estimates. This was supported by empirical data analyses of SARS-CoV-2 outbreaks in Australia and New Zealand, emphasizing the importance of considering the birth-death model for future viral outbreak investigations with low sequence diversity.
Article
Biochemistry & Molecular Biology
Caleb Ki, Jonathan Terhorst
Summary: This paper introduces VBSKY, a method for fitting Bayesian phylodynamic models to large pathogen genetic datasets. By combining recent advances in modeling, inference, and programming, VBSKY can analyze thousands of genomes in minutes and provide accurate estimates of epidemiologically relevant quantities.
MOLECULAR BIOLOGY AND EVOLUTION
(2022)
Article
Oncology
Antonia Chroni, Sayaka Miura, Lauren Hamilton, Tracy Vu, Stephen G. Gaffney, Vivian Aly, Sajjad Karim, Maxwell Sanderford, Jeffrey P. Townsend, Sudhir Kumar
Summary: By analyzing the genetic heterogeneity of tumors in cancer patients, we found that the migration histories of metastasis are often best described by a hybrid model of metastatic tumor evolution. We discovered that new tumor seedings arise from clones of pre-existing metastases as frequently as they do from clones from primary tumors. Additionally, there were many clone exchanges between the source and recipient tumors.
Review
Virology
Leo A. Featherstone, Joshua M. Zhang, Timothy G. Vaughan, Sebastian Duchene
Summary: This article reviews phylodynamic models, including foundational models and assumptions, and provides working knowledge for public health researchers, epidemiologists, and biologists. It explores the links between evolutionary models and epidemiological models, along with statistical inference methods, and highlights future directions.
Article
Evolutionary Biology
Ailene MacPherson, Stilianos Louca, Angela McLaughlin, Jeffrey B. Joy, Matthew W. Pennell
Summary: Birth-death stochastic processes are fundamental to many phylogenetic models, but there are various model variants that are difficult to understand and derive. This paper unifies these models into a single framework, showing their relationships and providing a straightforward procedure for deriving likelihood functions for complex models.
SYSTEMATIC BIOLOGY
(2022)
Article
Ecology
Leo A. Featherstone, Francesca Di Giallonardo, Edward C. Holmes, Timothy G. Vaughan, Sebastian Duchene
Summary: The article discusses incorporating un-sequenced case occurrence data alongside sequenced data in phylodynamic analysis, demonstrating through simulations that this approach can eliminate bias in estimates of the basic reproductive number due to misspecification of the sampling process. Additionally, it emphasizes that occurrence data are a valuable source of information for future phylodynamic analyses.
METHODS IN ECOLOGY AND EVOLUTION
(2021)
Article
Multidisciplinary Sciences
Bjorn T. Kopperud, Andrew F. Magee, Sebastian Hohna
Summary: This study examines the congruence class in phylogenies of exclusively extant taxa and concludes that strong directional trends in speciation and extinction rates are robustly inferred, while estimates of constant rates or gentle slopes are not reliable. The valid space for speciation rates is narrower and more constrained compared to extinction rates, providing further evidence that speciation rates can be estimated more accurately than extinction rates.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2023)
Article
Biochemistry & Molecular Biology
Ugne Stolz, Tanja Stadler, Nicola F. Mueller, Timothy G. Vaughan
Summary: The study introduces an extended structured coalescent method to investigate migration patterns between viral subpopulations, particularly for segmented viruses that can undergo reassortment. Through simulated data analysis, this method can accurately estimate subpopulation dependent effective population sizes, reassortment, and migration rates. Additionally, analyses of avian influenza A/H5N1 sequences show that accounting for segment reassortment and using sequencing data from multiple viral segments for joint phylodynamic inference leads to different estimates.
MOLECULAR BIOLOGY AND EVOLUTION
(2022)
Article
Virology
Carlo Pacioni, Robyn N. Hall, Tanja Strive, David S. L. Ramsey, Mandev S. Gill, Timothy G. Vaughan
Summary: Since their introduction in 1859, European rabbits (Oryctolagus cuniculus) have had a devastating impact on agricultural production and biodiversity in Australia. Biocontrol agents, particularly the rabbit haemorrhagic disease virus 1 (RHDV1), are important control strategies for rabbits. By analyzing RHDV molecular data, we found that RHDV1 and RHDV2 have similar dynamics since their release, with RHDV2 having a shorter timeframe. There is also evidence of competition between the two viruses.
Article
Biochemistry & Molecular Biology
Stilianos Louca, Angela McLaughlin, Ailene MacPherson, Jeffrey B. Joy, Matthew W. Pennell
Summary: Viral phylogenies are crucial for understanding infectious disease spread, but extracting information from them is complex due to uncertainties and sampling issues. Current methods may not reliably reconstruct the true epidemiological dynamics, requiring further research and strategies.
MOLECULAR BIOLOGY AND EVOLUTION
(2021)
Article
Virology
Jeremie Scire, Joelle Barido-Sottani, Denise Kuehnert, Timothy G. Vaughan, Tanja Stadler
Summary: The multi-type birth-death model with sampling is an evolution dynamic model that quantifies past population dynamics in structured populations based on phylogenetic trees, implemented using the bdmm package. Important algorithmic changes to bdmm allow the analysis of more genetic samples and improve numerical robustness and efficiency, leading to increased precision of parameter estimates, particularly for structured models with a high number of inferred parameters.
Article
Ecology
Sebastian Hoehna, Bjorn T. Kopperud, Andrew F. Magee
Summary: Diversification rates inferred from phylogenies are not identifiable due to the variability of speciation and extinction rates. However, congruence classes can be used to explore common features and assess the robustness of diversification rate patterns.
METHODS IN ECOLOGY AND EVOLUTION
(2022)
Article
Virology
Lenora Kepler, Marco Hamins-Puertolas, David A. Rasmussen
Summary: This study explores how genetic and non-genetic factors have influenced the fitness of SARS-CoV-2, with a particular focus on the impact of these factors on viral transmission dynamics in the United States. The findings reveal that before September 2020, spatial heterogeneity in transmission rates across geographic regions explained most of the fitness variation among pathogen lineages, while genetic variation in fitness became more prominent in late 2020 with the emergence of new lineages such as B.1.1.7, B.1.427, B.1.429, and B.1.526. The analysis also suggests that genetic variants in genomic regions outside of the Spike protein may play a significant role in overall fitness variation within the viral population.
Article
Genetics & Heredity
Pier Francesco Palamara, Jonathan Terhorst, Yun S. Song, Alkes L. Price
Article
Statistics & Probability
Jack Kamm, Jonathan Terhorst, Richard Durbin, Yun S. Song
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2020)
Article
Genetics & Heredity
Enes Dilber, Jonathan Terhorst
Summary: This article presents a new likelihood-based method for detecting natural selection that is robust to confounding factors such as population expansion. The method uses a probabilistic model of tree imbalance and a frequency spectrum-based estimator to detect signals of natural selection.
Article
Biochemistry & Molecular Biology
Iain Mathieson, Jonathan Terhorst
Summary: This study developed a novel method for estimating time-varying selection coefficients using ancient DNA data. Applying this method to ancient and present-day human genomes from Britain, the researchers identified seven loci with significant evidence of selection in the past 4500 years, mostly related to increased vitamin D or calcium levels. The strength of selection on individual loci varied over time, suggesting the influence of cultural or environmental factors. Skin pigmentation was the only complex trait with significant evidence of polygenic selection, emphasizing the importance of phenotypes related to vitamin D.
Article
Ecology
Brandon Legried, Jonathan Terhorst
Summary: In this paper, the statistical performance of demographic inference methods in the presence of continuous migration between populations is investigated. The theories of phase-type distributions and concentration of measure are employed to study the two-island and isolation-with-migration models, resulting in upper and lower bounds on rates of convergence for parametric estimators in migration models.
THEORETICAL POPULATION BIOLOGY
(2022)
Article
Biology
Subha Maity, Diptavo Dutta, Jonathan Terhorst, Yuekai Sun, Moulinath Banerjee
Summary: We propose new models and methods for the posterior drift problem, where the regression function in the target domain is modeled as a linear adjustment of that in the source domain. We study the theoretical properties of our estimators in the binary classification problem. Our approach is flexible and applicable in various statistical settings, including epidemiology, genetics, and biomedicine. We illustrate the power of our approach through mortality prediction for British Asians and overcoming spurious correlation in the Waterbirds dataset.
Article
Statistics & Probability
Caleb Ki, Jonathan Terhorst
Summary: Sequentially Markov coalescent (SMC) is an important family of models in statistical genetics for approximating genetic variation data distribution under complex evolutionary models. SMC-based methods are widely used in genetics and evolutionary biology for genotype phasing and imputation, recombination rate estimation, and population history inference. In this work, a method is proposed that enables SMC-based inference in a continuous state space without the need for discretization, making it faster and more accurate than existing methods.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)