Article

Efficient pedigree recording for fast population genetics simulation

Journal

PLOS COMPUTATIONAL BIOLOGY
Volume 14, Issue 11

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pcbi.1006581

Funding

  1. Sloan Foundation
  2. National Science Foundation [DBI-1262645]
  3. Wellcome Trust [100956/Z/13/Z]
  4. NIH [R01GM115564]
  5. USFWS

In this paper we describe how to efficiently record the entire genetic history of a population in forwards-time, individual-based population genetics simulations with arbitrary breeding models, population structure and demography. This approach dramatically reduces the computational burden of tracking individual genomes by allowing us to simulate only those loci that may affect reproduction (those having non-neutral variants). The genetic history of the population is recorded as a succinct tree sequence as introduced in the software package msprime, on which neutral mutations can be quickly placed afterwards. Recording the results of each breeding event requires storage that grows linearly with time, but there is a great deal of redundancy in this information. We solve this storage problem by providing an algorithm to quickly 'simplify' a tree sequence by removing this irrelevant history for a given set of genomes. By periodically simplifying the history with respect to the extant population, we show that the total storage space required is modest and overall large efficiency gains can be made over classical forwards-time simulations. We implement a general-purpose framework for recording and simplifying genealogical data, which can be used to make simulations of any population model more efficient. We modify two popular forwards-time simulation frameworks to use this new approach and observe efficiency gains in large, whole-genome simulations of one to two orders of magnitude. In addition to speed, our method for recording pedigrees has several advantages: (1) All marginal genealogies of the simulated individuals are recorded, rather than just genotypes. (2) A population of N individuals with M polymorphic sites can be stored in O(N log N + M) space, making it feasible to store a simulation's entire final generation as well as its history. (3) A simulation can easily be initialized with a more efficient coalescent simulation of deep history.
The software for recording and processing tree sequences is named tskit.
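
The record-then-simplify idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm and does not use the tskit API: it assumes a single non-recombining locus (so the "tree sequence" is one tree), a haploid Wright-Fisher population of constant size, and hypothetical function names (`simulate_parents`, `simplify`) chosen only for this example. It shows why the raw record grows linearly with time while the history relevant to the extant population stays small.

```python
import random


def simulate_parents(n, generations, seed=1):
    """Record every birth in a toy haploid Wright-Fisher simulation.

    Returns (parent, samples): a dict mapping each child node ID to its
    parent node ID, and the node IDs of the final generation.
    """
    rng = random.Random(seed)
    parent = {}
    current = list(range(n))          # founders get IDs 0 .. n-1
    next_id = n
    for _ in range(generations):
        offspring = []
        for _ in range(n):
            parent[next_id] = rng.choice(current)   # one record per birth
            offspring.append(next_id)
            next_id += 1
        current = offspring
    return parent, current


def simplify(parent, samples):
    """Keep only the history needed to relate the given samples.

    Drops lineages with no sampled descendants and skips ancestors that
    merely pass a lineage along without any coalescence.
    """
    # Walk upward from each sample to find every ancestral node.
    reachable = set()
    for node in samples:
        while node is not None and node not in reachable:
            reachable.add(node)
            node = parent.get(node)
    # Count how many ancestral child lineages each node has.
    n_children = {}
    for node in reachable:
        p = parent.get(node)
        if p is not None:
            n_children[p] = n_children.get(p, 0) + 1
    # Retain samples and coalescence points (two or more child lineages).
    keep = set(samples) | {v for v, k in n_children.items() if k >= 2}
    # Link each retained node to its nearest retained ancestor.
    simplified = {}
    for node in keep:
        p = parent.get(node)
        while p is not None and p not in keep:
            p = parent.get(p)
        if p is not None:
            simplified[node] = p
    return simplified


n, generations = 20, 200
parent, samples = simulate_parents(n, generations)
simplified = simplify(parent, samples)
print(len(parent))       # 4000 recorded births (n * generations)
print(len(simplified))   # at most 2 * n - 2 = 38 edges remain
```

In a longer run one would call the simplification step every few generations and continue recording from the reduced record, which is the periodic simplification the abstract describes. For recombining genomes each record also needs the inherited genomic interval, which is what the tree sequence tables in tskit store.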

Recommended

Article Ecology

System drift and speciation

Joshua S. Schiffman, Peter L. Ralph

Summary: Even if a species' phenotype remains unchanged, the underlying molecular mechanism may change. This study uses linear system theory to explore how the gene network supporting a conserved phenotype evolves, and how the exploration of distinct mechanisms can lead to reproductive incompatibility between independently evolving populations.

EVOLUTION (2022)

Article Genetics & Heredity

Efficient ancestry and mutation simulation with msprime 1.0

Franz Baumdicker, Gertjan Bisschop, Daniel Goldstein, Graham Gower, Aaron P. Ragsdale, Georgia Tsambos, Sha Zhu, Bjarki Eldon, E. Castedo Ellerman, Jared G. Galloway, Ariella L. Gladstein, Gregor Gorjanc, Bing Guo, Ben Jeffery, Warren W. Kretzschmar, Konrad Lohse, Michael Matschiner, Dominic Nelson, Nathaniel S. Pope, Consuelo D. Quinto-Cortes, Murillo F. Rodrigues, Kumar Saunack, Thibaut Sellinger, Kevin Thornton, Hugo van Kemenade, Anthony W. Wohns, Yan Wong, Simon Gravel, Andrew D. Kern, Jere Koskela, Peter L. Ralph, Jerome Kelleher

Summary: This article introduces the release of msprime 1.0 and summarizes its numerous features and advantages through a collaborative open source development model. Compared to specialized alternatives, msprime demonstrates excellent performance with faster speed and higher memory efficiency, making it a commonly used simulation tool in population genetics.

GENETICS (2022)

Article Multidisciplinary Sciences

A unified genealogy of modern and ancient genomes

Anthony Wilder Wohns, Yan Wong, Ben Jeffery, Ali Akbari, Swapan Mallick, Ron Pinhasi, Nick Patterson, David Reich, Jerome Kelleher, Gil McVean

Summary: The sequencing of modern and ancient genomes from around the world has revolutionized our understanding of human history and evolution. Although the problem of characterizing ancestral relationships from genomic variation remains unsolved, nonparametric methods have been used successfully to infer a unified genealogy of modern and ancient humans, identify descendants of ancient samples, and estimate geographical location of ancestors.

SCIENCE (2022)

Review Ecology

Conceptualizing ecosystem services using social-ecological networks

Maria R. Felipe-Lucia, Angela M. Guerrero, Steven M. Alexander, Jaime Ashander, Jacopo A. Baggio, Michele L. Barnes, Orjan Bodin, Aletta Bonn, Marie-Josee Fortin, Rachel S. Friedman, Jessica A. Gephart, Kate J. Helmstedt, Aislyn A. Keyes, Kailin Kroetz, Francois Massol, Michael J. O. Pocock, Jesse Sayles, Ross M. Thompson, Spencer A. Wood, Laura E. Dee

Summary: This article discusses the challenges and opportunities of using social-ecological networks (SENs) in ecosystem service research, and proposes a typology to represent ecosystem services in SENs. The typology provides guidance for improving research design and addressing a wider range of questions regarding human-nature interdependencies.

TRENDS IN ECOLOGY & EVOLUTION (2022)

Article Biochemical Research Methods

Bayesian inference of ancestral recombination graphs

Ali Mahmoudi, Jere Koskela, Jerome Kelleher, Yao-ban Chan, David Balding

Summary: This article presents a novel algorithm, ARGinfer, for probabilistic inference of the Ancestral Recombination Graph under the Coalescent with Recombination. The algorithm uses the Succinct Tree Sequence data structure and accurately estimates evolutionary history properties of the sample, providing interpretable uncertainty assessments through posterior probability distributions.

PLOS COMPUTATIONAL BIOLOGY (2022)

Article Ecology

The power of forecasts to advance ecological theory

Abigail S. L. Lewis, Christine R. Rollinson, Andrew J. Allyn, Jaime Ashander, Stephanie Brodie, Cole B. Brookson, Elyssa Collins, Michael C. Dietze, Amanda S. Gallinat, Noel Juvigny-Khenafou, Gerbrand Koren, Daniel J. McGlinn, Hassan Moustahfid, Jody A. Peters, Nicholas R. Record, Caleb J. Robbins, Jonathan Tonkin, Glenda M. Wardle

Summary: This article introduces a conceptual framework that describes how ecological forecasting can energize and advance ecological theory. The authors emphasize the potential for future progress through increased forecast development, comparison, and synthesis. They envision a future where forecasting is integrated as part of the toolset used in fundamental ecology, and aim to decrease barriers to entry and broaden the community of researchers using forecasting for fundamental ecological insight.

METHODS IN ECOLOGY AND EVOLUTION (2023)

Article Green & Sustainable Science & Technology

Guiding large-scale management of invasive species using network metrics

Jaime Ashander, Kailin Kroetz, Rebecca Epanchin-Niell, Nicholas B. D. Phelps, Robert G. Haight, Laura E. Dee

Summary: Using network metrics to guide management can effectively address the challenges of biological invasions. The study evaluates the performance of network-guided invasive species management compared to optimal management and finds that the network-guided approach achieves high performance, even with incomplete information. This research highlights the potential of network approaches for sustainable resource management.

NATURE SUSTAINABILITY (2022)

Article Genetics & Heredity

Demes: a standard format for demographic models

Graham Gower, Aaron P. Ragsdale, Gertjan Bisschop, Ryan N. Gutenkunst, Matthew Hartfield, Ekaterina Noskova, Stephan Schiffels, Travis J. Struck, Jerome Kelleher, Kevin R. Thornton

Summary: Understanding the demographic history of populations is crucial in population genetics, but the lack of a standardized format to define population dynamic models hampers progress in the field. Therefore, we propose the Demes data model and file format to address these issues.

GENETICS (2022)

Article Multidisciplinary Sciences

On the genes, genealogies, and geographies of Quebec

Luke Anderson-Trocme, Dominic Nelson, Shadi Zabad, Alex Diaz-Papkovich, Ivan Kryukov, Nikolas Baya, Mathilde Touvier, Ben Jeffery, Christian Dina, Helene Vezina, Jerome Kelleher, Simon Gravel

Summary: Population genetic models provide coarse representations of real-world ancestry, but this study used a large pedigree and genotype data to finely model and trace French Canadian ancestry. The loss of ancestral population structure and the emergence of spatial and regional structure highlights various population expansion models. Migration, genetic, and genealogical patterns were found within river networks in different regions of Quebec. The study also provides a simulated whole-genome sequence dataset for investigating population genetics at a high resolution.

SCIENCE (2023)

Article Biology

Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

M. Elise Lauterbur, Maria Izabel A. Cavassim, Ariella L. Gladstein, Graham Gower, Nathaniel S. Pope, Georgia Tsambos, Jeffrey Adrion, Saurabh Belsare, Arjun Biddanda, Victoria Caudill, Jean Cury, Ignacio Echevarria, Benjamin C. Haller, Ahmed R. Hasan, Xin Huang, Leonardo Nicola Martin Iasi, Ekaterina Noskova, Jana Obsteter, Vitor Antonio Correa Pavinato, Alice Pearson, David Peede, Manolo F. Perez, Murillo F. Rodrigues, Chris C. R. Smith, Jeffrey P. Spence, Anastasia Teterina, Silas Tittes, Per Unneberg, Juan Manuel Vazquez, Ryan K. Waples, Anthony Wilder Wohns, Yan Wong, Franz Baumdicker, Reed A. Cartwright, Gregor Gorjanc, Ryan N. Gutenkunst, Jerome Kelleher, Andrew D. Kern, Aaron P. Ragsdale, Peter L. Ralph, Daniel R. Schrider, Ilan Gronau

Summary: Simulation is crucial for population genetics research, but it remains a challenge to produce simulations that accurately represent genomic datasets. The development of more realistic simulations has become possible with advances in genetic data and simulation software. However, it still requires significant time and specialized knowledge.

Article Genetics & Heredity

Dispersal inference from population genetic variation using a convolutional neural network

Chris C. R. Smith, Silas Tittes, Peter L. Ralph, Andrew D. Kern

Summary: The geographic nature of biological dispersal shapes genetic variation patterns, allowing the estimation of dispersal properties from genetic data. This study presents a deep learning approach called disperseNN, which uses geographically distributed genotype data and a convolutional neural network to estimate the mean per-generation dispersal distance. Through extensive simulations, disperseNN is shown to outperform or be competitive with existing methods, especially for small sample sizes. It also proves effective in estimating dispersal distance when other model parameters are unknown, without relying on local population density or accurate inference of identity-by-descent tracts.

GENETICS (2023)

Article Biochemical Research Methods

Enabling Inference for Context-Dependent Models of Mutation by Bounding the Propagation of Dependency

Frederick A. Matsen, Peter L. Ralph

Summary: This article presents a method for computing the likelihoods of genome mutations using bounds on the propagation of dependency. It also discusses protocols for examining residuals and iterative model refinement. The method provides an R package for efficient analysis and can be used to examine the context dependence of mutations.

JOURNAL OF COMPUTATIONAL BIOLOGY (2022)

Article Genetics & Heredity

Evaluating human autosomal loci for sexually antagonistic viability selection in two large biobanks

Katja R. Kasimatis, Abin Abraham, Peter L. Ralph, Andrew D. Kern, John A. Capra, Patrick C. Phillips

Summary: Sex and sexual differentiation are common in various species, leading to different selection pressures between sexes. However, studies on autosomal loci in humans did not find clear evidence of sexually antagonistic viability selection, possibly due to cross-hybridization with sex chromosome regions at these loci.

GENETICS (2021)
