Article

Fast hierarchical Bayesian analysis of population structure

Journal

NUCLEIC ACIDS RESEARCH
Volume 47, Issue 11, Pages 5539-5549

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/nar/gkz361

Funding

  1. Wellcome Trust [206194, 204016]
  2. ERC [742158]
  3. Alan Turing Institute via an Engineering and Physical Sciences Research Council grant [EP/510129/1]
  4. U.S. National Institutes of Health [R01AI135970]

We present fastbaps, a fast solution to the genetic clustering problem. Fastbaps rapidly identifies an approximate fit to a Dirichlet process mixture model (DPM) for clustering multilocus genotype data. Our efficient model-based clustering approach can cluster datasets 10-100 times larger than existing model-based methods can handle, which we demonstrate by analyzing an alignment of over 110 000 sequences of HIV-1 pol genes. We also provide a method for rapidly partitioning an existing hierarchy to maximize the DPM model marginal likelihood, allowing us to split phylogenetic trees into clades and sub-clades using a population genomic model. Extensive tests on simulated data, as well as on a diverse set of real bacterial and viral datasets, show that fastbaps provides solutions comparable to or better than those of previous model-based methods, while being significantly faster. The method is freely available under an open source MIT licence as an easy-to-use R package at https://github.com/gtonkinhill/fastbaps.
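To make the idea of scoring a partition by marginal likelihood concrete, the following is a minimal Python sketch and not the fastbaps implementation. It assumes a symmetric Dirichlet prior over nucleotides at each site and omits the DPM prior over partitions; the function names `log_marginal` and `partition_score`, the prior weight `alpha`, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import lgamma


def log_marginal(cluster, alphabet="ACGT", alpha=1.0):
    """Log Dirichlet-multinomial marginal likelihood of one cluster.

    `cluster` is a list of equal-length sequences. Each site is scored
    independently under a symmetric Dirichlet(alpha) prior (an assumption
    made for this sketch). Characters outside `alphabet` are ignored.
    """
    k = len(alphabet)
    total = 0.0
    for site in zip(*cluster):  # iterate over alignment columns
        counts = Counter(site)
        n = sum(counts[a] for a in alphabet)
        total += lgamma(k * alpha) - lgamma(k * alpha + n)
        total += sum(lgamma(alpha + counts[a]) - lgamma(alpha) for a in alphabet)
    return total


def partition_score(partition, **kwargs):
    """Sum of per-cluster log marginal likelihoods (partition prior omitted)."""
    return sum(log_marginal(cluster, **kwargs) for cluster in partition)


# Toy data: two visibly distinct groups of sequences.
seqs_a = ["AAAA", "AAAA", "AAAT"]
seqs_b = ["CCGG", "CCGG", "CCGC"]

merged = partition_score([seqs_a + seqs_b])  # everything in one cluster
split = partition_score([seqs_a, seqs_b])    # two clusters

print(f"merged: {merged:.3f}, split: {split:.3f}")
```

In this toy case the two-cluster partition scores higher than the merged one, which is the kind of comparison a method would make when deciding where to cut a hierarchy.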

Recommended

Article Infectious Diseases

Database of epidemic trends and control measures during the first wave of COVID-19 in mainland China

Han Fu, Haowei Wang, Xiaoyue Xi, Adhiratha Boonyasiri, Yuanrong Wang, Wes Hinsley, Keith J. Fraser, Ruth McCabe, Daniela Olivera Mesa, Janetta Skarp, Alice Ledda, Tamsin Dewe, Amy Dighe, Peter Winskill, Sabine L. van Elsland, Kylie E. C. Ainslie, Marc Baguelin, Samir Bhatt, Olivia Boyd, Nicholas F. Brazeau, Lorenzo Cattarino, Giovanni Charles, Helen Coupland, Zulma M. Cucunuba, Gina Cuomo-Dannenburg, Christl A. Donnelly, Ilaria Dorigatti, Oliver D. Eales, Richard G. FitzJohn, Seth Flaxman, Katy A. M. Gaythorpe, Azra C. Ghani, William D. Green, Arran Hamlet, Katharina Hauck, David J. Haw, Benjamin Jeffrey, Daniel J. Laydon, John A. Lees, Thomas Mellan, Swapnil Mishra, Gemma Nedjati-Gilani, Pierre Nouvellet, Lucy Okell, Kris Parag, Manon Ragonnet-Cronin, Steven Riley, Nora Schmit, Hayley A. Thompson, H. Juliette T. Unwin, Robert Verity, Michaela A. C. Vollmer, Erik Volz, Patrick G. T. Walker, Caroline E. Walters, Oliver J. Watson, Charles Whittaker, Lilith K. Whittles, Natsuko Imai, Sangeeta Bhatia, Neil M. Ferguson

Summary: This study aimed to provide a comprehensive database describing the epidemic trends and responses during the first wave of COVID-19 in China. The results showed differences in epidemic trends and control measures among provinces, but focusing on testing and quarantine of inbound travelers helped sustain the control of the epidemic as local transmission declined.

INTERNATIONAL JOURNAL OF INFECTIOUS DISEASES (2021)

Article Genetics & Heredity

A comprehensive and high-quality collection of Escherichia coli genomes and their genes

Gal Horesh, Grace A. Blackwell, Gerry Tonkin-Hill, Jukka Corander, Eva Heinz, Nicholas R. Thomson

Summary: Escherichia coli, a highly diverse organism with high genome plasticity and a large gene pool, is considered a priority pathogen due to observed drug resistance. Despite the abundance of available data, accessing and analyzing information remains a challenge. By curating a high-quality dataset of over 10,000 E. coli and Shigella genomes, this study aims to provide a foundation for future research on the biological differences between E. coli lineages and the distribution of genes in the population.

MICROBIAL GENOMICS (2021)

Article Multidisciplinary Sciences

Apparent nosocomial adaptation of Enterococcus faecalis predates the modern hospital era

Anna K. Pontinen, Janetta Top, Sergio Arredondo-Alonso, Gerry Tonkin-Hill, Ana R. Freitas, Carla Novais, Rebecca A. Gladstone, Maiju Pesonen, Rodrigo Meneses, Henri Pesonen, John A. Lees, Dorota Jamrozy, Stephen D. Bentley, Val F. Lanza, Carmen Torres, Luisa Peixe, Teresa M. Coque, Julian Parkhill, Anita C. Schurch, Rob J. L. Willems, Jukka Corander

Summary: The study reveals that Enterococcus faecalis is a commensal microorganism as well as a nosocomial pathogen, with common ancestors of multiple hospital-associated lineages dating back to the pre-antibiotic era.

NATURE COMMUNICATIONS (2021)

Article Immunology

International links between Streptococcus pneumoniae vaccine serotype 4 sequence type (ST) 801 in Northern European shipyard outbreaks of invasive pneumococcal disease

R. A. Gladstone, L. Siira, O. B. Brynildsrud, D. F. Vestrheim, P. Turner, S. C. Clarke, S. Srifuengfung, R. Ford, D. Lehmann, E. Egorova, E. Voropaeva, G. Haraldsson, K. G. Kristinsson, L. McGee, R. F. Breiman, S. D. Bentley, C. L. Sheppard, N. K. Fry, J. Corander, M. Toropainen, A. Steens

Summary: This study used genomics to investigate the international links between outbreaks of vaccine-preventable serotype 4 sequence type 801 in shipyards in several countries. The findings suggest that the total diversity of ST801 within the outbreaks cannot be explained by recent transmission alone, indicating potential international transmission between shipyards.

VACCINE (2022)

Article Virology

Whole-genome analysis to determine the rate and patterns of intra-subtype reassortment among influenza type-A viruses in Africa

Grace Nabakooza, Andrzej Pastusiak, David Patrick Kateete, Julius Julian Lutwama, John Mulindwa Kitayimbwa, Simon David William

Summary: This study found a high frequency of intra-subtype reassortment events among influenza viruses in Africa, highlighting the importance of reassortment in the evolution and diversification of the virus. The spatial and temporal distribution patterns of reassortants suggest that Africa is part of the global influenza ecology. The findings emphasize the importance of routine whole-genome sequencing and analysis in monitoring the circulation of influenza viruses and detecting emerging viruses.

VIRUS EVOLUTION (2022)

Article Genetics & Heredity

Targeted control of pneumolysin production by a mobile genetic element in Streptococcus pneumoniae

Emily J. Stevens, Daniel J. Morse, Dora Bonini, Seana Duggan, Tarcisio Brignoli, Mario Recker, John A. Lees, Nicholas J. Croucher, Stephen Bentley, Daniel J. Wilson, Sarah G. Earle, Robert Dixon, Angela Nobbs, Howard Jenkinson, Tim van Opijnen, Derek Thibault, Oliver J. Wilkinson, Mark S. Dillingham, Simon Carlile, Rachel M. McLoughlin, Ruth C. Massey

Summary: Streptococcus pneumoniae is a major human pathogen that can cause severe invasive diseases such as pneumonia, septicaemia, and meningitis. The haemolytic toxin pneumolysin (Ply) is identified as a primary virulence factor for this bacterium, and a novel modular protein, ZomB, is found to regulate Ply activity and potentially influence bacterial colonization in the respiratory tract and lungs in mice. Additionally, the antibiotic resistance gene acquired on the ICE ICESp23FST81 is shown to play a role in controlling the expression of a major virulence factor, suggesting its importance in the success of S. pneumoniae lineages that acquire it.

MICROBIAL GENOMICS (2022)

Article Microbiology

Pneumococcal within-host diversity during colonization, transmission and treatment

Gerry Tonkin-Hill, Clare Ling, Chrispin Chaguza, Susannah J. Salter, Pattaraporn Hinfonthong, Elissavet Nikolaou, Natalie Tate, Andrzej Pastusiak, Claudia Turner, Claire Chewapreecha, Simon D. W. Frost, Jukka Corander, Nicholas J. Croucher, Paul Turner, Stephen D. Bentley

Summary: Characterizing the genetic diversity of Streptococcus pneumoniae through deep sequencing provides valuable insights into colonization dynamics, transmission, inter-strain competition, and the impact of antibiotic treatment.

NATURE MICROBIOLOGY (2022)

Article Biochemistry & Molecular Biology

Robust analysis of prokaryotic pangenome gene gain and loss rates with Panstripe

Gerry Tonkin-Hill, Rebecca A. Gladstone, Anna K. Pontinen, Sergio Arredondo-Alonso, Stephen D. Bentley, Jukka Corander

Summary: Horizontal gene transfer (HGT) is important for the evolution and diversification of microbial species. Existing methods for analyzing gene presence/absence patterns do not consider errors in annotation and clustering. The new method Panstripe, based on generalized linear regression, can effectively identify differences in HGT events by accounting for population structure and errors in gene prediction.

GENOME RESEARCH (2023)

Article Immunology

Use of Next-Generation Sequencing in a State-Wide Strategy of HIV-1 Surveillance: Impact of the SARS-COV-2 Pandemic on HIV-1 Diagnosis and Transmission

Shuntai Zhou, Nathan Long, Matt Moeser, Collin S. Hill, Erika Samoff, Victoria Mobley, Simon Frost, Cara Bayer, Elizabeth Kelly, Annalea Greifinger, Scott Shone, William Glover, Michael Clark, Joseph Eron, Myron Cohen, Ronald Swanstrom, Ann M. Dennis

Summary: The state-wide HIV recency surveillance system reveals that the SARS-CoV-2 pandemic has caused a delay in detecting recent HIV infection among people of color. Public health resources should prioritize restoring HIV-1 testing and preventing ongoing transmission.

JOURNAL OF INFECTIOUS DISEASES (2023)

Review Genetics & Heredity

Challenges in prokaryote pangenomics

Gerry Tonkin-Hill, Jukka Corander, Julian Parkhill

Summary: Horizontal gene transfer (HGT) and patterns of gene gain and loss are essential in bacterial evolution. Understanding these patterns can shed light on selection's role in the evolution of bacterial pangenomes and bacterial adaptation to new niches. However, predicting gene presence or absence can be prone to errors, which can complicate the study of HGT dynamics. This review discusses the challenges of constructing accurate pangenomes and the potential consequences of errors on downstream analyses, with the goal of improving bacterial pangenome analyses.

MICROBIAL GENOMICS (2023)

Article Biochemistry & Molecular Biology

Accurate and fast graph-based pangenome annotation and clustering with ggCaller

Samuel T. Horsfield, Gerry Tonkin-Hill, Nicholas J. Croucher, John A. Lees

Summary: ggCaller is a novel bacterial genome analysis tool that combines gene prediction, functional annotation, and clustering, resulting in more accurate predictions and clustering. It has considerable speed-ups and can handle complex sources of error. It is also useful for bacterial genome-wide association studies and functional analyses.

GENOME RESEARCH (2023)

Article Genetics & Heredity

Mge-cluster: a reference-free approach for typing bacterial plasmids

Sergio Arredondo-Alonso, Rebecca A. Gladstone, Anna K. Pontinen, Joao A. Gama, Anita C. Schurch, Val F. Lanza, Pal Jarle Johnsen, Orjan Samuelsen, Gerry Tonkin-Hill, Jukka Corander

Summary: Extrachromosomal elements of bacterial cells, such as plasmids, play a crucial role in evolution and adaptation. However, the classification of plasmids is still limited, which motivated the development of an efficient approach called mge-cluster that can recognize novel plasmid types and classify them into previously identified groups. This approach offers faster runtime, moderate memory usage, and an intuitive visualization, classification, and clustering scheme within a single framework. By analyzing a population-wide plasmid data set from Escherichia coli, the study highlights the prevalence of the colistin resistance gene and describes plasmid transmission in a hospital environment.

NAR GENOMICS AND BIOINFORMATICS (2023)

Article Virology

Model design for nonparametric phylodynamic inference and applications to pathogen surveillance

Xavier Didelot, Vinicius Franceschi, Simon D. W. Frost, Ann Dennis, Erik M. Volz

Summary: Inference of effective population size from genomic data can provide insights into demographic history and epidemiological dynamics. A nonparametric approach based on latent process models is developed to estimate the population size dynamics, optimizing parameters using out-of-sample prediction accuracy. The methodology is demonstrated using simulation experiments and applied to HIV-1 and SARS-CoV-2 datasets to estimate the impact of interventions on epidemic dynamics.

VIRUS EVOLUTION (2023)

Article Genetics & Heredity

Genome-wide association, prediction and heritability in bacteria with application to Streptococcus pneumoniae

Sudaraka Mallawaarachchi, Gerry Tonkin-Hill, Nicholas J. Croucher, Paul Turner, Doug Speed, Jukka Corander, David Balding

Summary: This study proposes methods for analyzing bacterial traits using whole-genome sequencing and applies them to analyze several phenotypes of Streptococcus pneumoniae. The results show high heritability and prediction accuracy for minimum inhibitory concentrations (MIC), revealing genetic associations and surprising findings regarding resistance. The study also suggests the moderate heritability and polygenic nature of within-host carriage duration.

NAR GENOMICS AND BIOINFORMATICS (2022)
