4.7 Article

Predicting amphibian intraspecific diversity with machine learning: Challenges and prospects for integrating traits, geography, and genetic data

Journal

MOLECULAR ECOLOGY RESOURCES
Volume 21, Issue 8, Pages 2818-2831

Publisher

WILEY
DOI: 10.1111/1755-0998.13303

Keywords

Caudata; data repurposing; latitude; nucleotide diversity; phylogenetic signal; random forests

Funding

  1. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior [88881.170016/2018]
  2. Division of Biological Infrastructure [1910623]
  3. Division of Biological Infrastructure
  4. Directorate for Biological Sciences [1910623]; funding source: National Science Foundation

Combining machine learning with genetic data sets holds great promise for addressing long-standing questions in ecology and evolution. Identifying what determines intraspecific genetic diversity is challenging because many factors may be involved and their relative importance varies across taxonomic and geographic scales. Data repurposing and machine learning can help identify key predictors of genetic diversity, offering insights relevant to conservation.
The growing availability of genetic data sets, in combination with machine learning frameworks, offers great potential to answer long-standing questions in ecology and evolution. One such question has intrigued population geneticists, biogeographers, and conservation biologists: What factors determine intraspecific genetic diversity? This question is challenging to answer because many factors may influence genetic variation, including life history traits, historical influences, and geography, and the relative importance of these factors varies across taxonomic and geographic scales. Furthermore, interpreting the influence of numerous, potentially correlated variables is difficult with traditional statistical approaches. To address these challenges, we analysed repurposed data using machine learning and investigated predictors of genetic diversity, focusing on Nearctic amphibians as a case study. We aggregated species traits, range characteristics, and >42,000 genetic sequences for 299 species using open-access scripts and various databases. After identifying important predictors of nucleotide diversity with random forest regression, we conducted follow-up analyses to examine the roles of phylogenetic history, geography, and demographic processes on intraspecific diversity. Although life history traits were not important predictors for this data set, we found significant phylogenetic signal in genetic diversity within amphibians. We also found that salamander species at northern latitudes contained low genetic diversity. Data repurposing and machine learning provide valuable tools for detecting patterns with relevance for conservation, but concerted efforts are needed to compile meaningful data sets with greater utility for understanding global biodiversity.
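The response variable in the abstract, nucleotide diversity, is commonly defined as the average proportion of sites that differ between two sequences drawn from a sample. The sketch below illustrates that definition only; it is not the authors' pipeline, and the function name `nucleotide_diversity` and the rule for skipping gaps and ambiguous bases are assumptions made for this example.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Hypothetical helper for illustration; assumes the sequences are already
    aligned to a common length.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same non-zero length")

    valid = set("ACGT")
    total = 0.0
    pairs = 0
    for a, b in combinations(sequences, 2):
        # Assumption: compare only sites where both sequences carry an
        # unambiguous base, so gaps and N's are ignored for that pair.
        compared = [
            (x, y)
            for x, y in zip(a.upper(), b.upper())
            if x in valid and y in valid
        ]
        if not compared:
            continue  # this pair shares no comparable sites
        diffs = sum(1 for x, y in compared if x != y)
        total += diffs / len(compared)
        pairs += 1

    if pairs == 0:
        raise ValueError("no pair of sequences has comparable sites")
    return total / pairs


if __name__ == "__main__":
    sample = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(sample))
```

A per-species value computed this way could then serve as the response in a regression on traits and range characteristics, which is the role nucleotide diversity plays in the random forest analysis described above.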

Recommended

Article Ecology

Diversification with gene flow and niche divergence in a lizard species along the South American diagonal of open formations

Emanuel M. Fonseca, Marcelo Gehara, Fernanda P. Werneck, Flavia M. Lanna, Guarino R. Colli, Jack W. Sites, Miguel T. Rodrigues, Adrian A. Garda

JOURNAL OF BIOGEOGRAPHY (2018)

Article Biochemistry & Molecular Biology

The evolutionary history of Lygodactylus lizards in the South American open diagonal

Flavia Lanna, Fernanda Werneck, Marcelo Gehara, Emanuel Fonseca, Guarino Colli, Jack Sites, Miguel Rodrigues, Adrian Garda

MOLECULAR PHYLOGENETICS AND EVOLUTION (2018)

Article Biodiversity Conservation

The role of strict nature reserves in protecting genetic diversity in a semiarid vegetation in Brazil

Emanuel M. Fonseca, Fernanda P. Werneck, Marcelo Gehara, Eliana F. Oliveira, Felipe de M. Magalhaes, Flavia M. Lanna, Guilherme S. Lima, Ricardo Marques, Daniel O. Mesquita, Gabriel C. Costa, Guarino R. Colli, Adrian A. Garda

BIODIVERSITY AND CONSERVATION (2019)

Article Evolutionary Biology

Dwarf geckos and giant rivers: the role of the Sao Francisco River in the evolution of Lygodactylus klugei (Squamata: Gekkonidae) in the semi-arid Caatinga of north-eastern Brazil

Flavia M. Lanna, Marcelo Gehara, Fernanda P. Werneck, Emanuel M. Fonseca, Guarino R. Colli, Jack W. Sites, Miguel T. Rodrigues, Adrian A. Garda

BIOLOGICAL JOURNAL OF THE LINNEAN SOCIETY (2020)

Article Ecology

Evolutionary history of Neotropical savannas geographically concentrates species, phylogenetic and functional diversity of lizards

Jessica Fenker, Fabricius M. C. B. Domingos, Leonardo G. Tedeschi, Dan F. Rosauer, Fernanda P. Werneck, Guarino R. Colli, Roger M. D. Ledo, Emanuel M. Fonseca, Adrian A. Garda, Derek Tucker, Jack W. Sites, Maria F. Breitman, Flavia Soares, Lilian G. Giugliano, Craig Moritz

JOURNAL OF BIOGEOGRAPHY (2020)

Article Ecology

Isolation by environment and recurrent gene flow shaped the evolutionary history of a continentally distributed Neotropical treefrog

Felipe Camurugi, Marcelo Gehara, Emanuel M. Fonseca, Kelly R. Zamudio, Celio F. B. Haddad, Guarino R. Colli, Maria Tereza C. Thome, Cynthia P. A. Prado, Marcelo F. Napoli, Adrian A. Garda

Summary: This study reveals two genetic lineages of a widespread South American treefrog that diverged in the mid-Pleistocene with continued gene flow. The scenario of isolation with migration until the Last Glacial Maximum was supported, followed by recent population expansion in northeastern Brazil and stability in southwest South America. Isolation by environment was the best predictor of genetic distance between populations, indicating the importance of environmental niches in shaping genetic differentiation.

JOURNAL OF BIOGEOGRAPHY (2021)

Article Ecology

P2C2M.GMYC: An R package for assessing the utility of the Generalized Mixed Yule Coalescent model

Emanuel M. Fonseca, Drew J. Duckett, Bryan C. Carstens

Summary: The research team developed a software package to assess the fit of the GMYC model with empirical data, recommending its use when applying the GMYC method to identify unknown species boundaries. The package has modest computational requirements, is user-friendly, and can complete analyses on a typical laptop within minutes to an hour.

METHODS IN ECOLOGY AND EVOLUTION (2021)

Article Biochemistry & Molecular Biology

Phylogeographic model selection using convolutional neural networks

Emanuel M. Fonseca, Guarino R. Colli, Fernanda P. Werneck, Bryan C. Carstens

Summary: The discipline of phylogeography has rapidly advanced in utilizing a wide range of analytical tools for analyzing large genomic data sets. This study demonstrates the effectiveness of convolutional neural networks (CNNs) in accurately assessing demographic models in South American lizards, with a model accuracy exceeding 98% for all lineages. This highlights the potential of CNNs as a valuable addition to the phylogeographer's toolkit.

MOLECULAR ECOLOGY RESOURCES (2021)

Article Ecology

The riverine thruway hypothesis: rivers as a key mediator of gene flow for the aquatic paradoxical frog Pseudis tocantins (Anura, Hylidae)

Emanuel M. Fonseca, Adrian A. Garda, Eliana F. Oliveira, Felipe Camurugi, Felipe de M. Magalhaes, Flavia M. Lanna, Juan Pablo Zurano, Ricardo Marques, Miguel Vences, Marcelo Gehara

Summary: Rivers, landscape, and climate can alter patterns of gene flow and shape intraspecific genetic variation. This study of the highly aquatic frog Pseudis tocantins in central Brazil found that genetic differentiation among localities is mostly explained by river connectivity, while elevation, slope, and climate have little impact. Migration was directional, from upstream to downstream sites.

LANDSCAPE ECOLOGY (2021)

Review Ecology

Assessing model adequacy leads to more robust phylogeographic inference

Bryan C. Carstens, Megan L. Smith, Drew J. Duckett, Emanuel M. Fonseca, M. Tereza C. Thome

Summary: Phylogeographic studies rely on complex demographic models and large data sets, but researchers face challenges in choosing appropriate models, interpreting results, and evaluating model fit. Currently, most attention is given to interpreting results, while model selection and evaluation are often overlooked.

TRENDS IN ECOLOGY & EVOLUTION (2022)

Article Multidisciplinary Sciences

Assessing model adequacy for Bayesian Skyline plots using posterior predictive simulation

Emanuel M. Fonseca, Drew J. Duckett, Filipe G. Almeida, Megan L. Smith, Maria Tereza C. Thome, Bryan C. Carstens

Summary: P2C2M.Skyline is a tool designed to evaluate model adequacy for BSPs, helping researchers detect model violations to avoid spurious results. It successfully identified model violations in simulated datasets and showed low false positive rates under simulated BSP models. The tool also performed well in empirical systems.

PLOS ONE (2022)

Article Evolutionary Biology

Pleistocene glaciations caused the latitudinal gradient of within-species genetic diversity

Emanuel M. Fonseca, Tara A. Pelletier, Sydney K. Decker, Danielle J. Parsons, Bryan C. Carstens

Summary: This study demonstrated that tropical species have higher levels of intraspecific genetic diversity compared to non-tropical species. Additionally, the data suggests that non-tropical species show deviations from neutral expectations, indicating historical population fluctuations possibly associated with Pleistocene glacial cycles. These findings suggest that Quaternary climate perturbations may play a more significant role in driving the latitudinal gradient in species richness than previously thought.

EVOLUTION LETTERS (2023)

Article Evolutionary Biology

Genetic diversity of North American vertebrates in protected areas

Coleen E. P. Thompson, Tara A. Pelletier, Bryan C. Carstens

Summary: The study compared genetic diversity inside and outside of protected areas in 44 vertebrate species and found that 48% of species showed significant differences in nucleotide diversity between the two areas. However, it remains unclear what factors influence the relative amount of genetic diversity in protected areas across different species.

BIOLOGICAL JOURNAL OF THE LINNEAN SOCIETY (2021)