Article

Towards the biogeography of prokaryotic genes

Journal

NATURE
Volume 601, Issue 7892, Pages 252-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41586-021-04233-4

Funding

  1. European Union [686070: DD-DeCaF]
  2. Marie Skłodowska-Curie Actions [713673]
  3. European Research Council (ERC) MicrobioS [ERC-AdG-669830]
  4. JTC project jumpAR [01KI1706]
  5. BMBF Grant [031L0181A: LAMarCK]
  6. European Molecular Biology Laboratory
  7. ETH
  8. Helmut Horten Foundation
  9. National Key R&D Program of China [2020YFA0712403]
  10. National Natural Science Foundation of China [61932008, 61772368, 31950410544]
  11. Shanghai Municipal Science and Technology Major Project [2018SHZDZX01]
  12. Zhangjiang Lab
  13. International Development Research Centre [109304]
  14. la Caixa Foundation [100010434, LCF/BQ/DI18/11660009]
  15. Severo Ochoa Program for Centres of Excellence in R&D from the Agencia Estatal de Investigación of Spain [SEV-2016-0672]
  16. Ministerio de Ciencia, Innovación y Universidades [PGC2018-098073-A-I00]
  17. Innovation Fund Denmark [4203-00005B]
  18. Biotechnology and Biological Sciences Research Council (BBSRC) Institute Strategic Programme Gut Microbes and Health [BB/R012490/1, BBS/E/F/000PR10355]

Most microbial genes are specific to a single habitat; the small fraction found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. A small fraction of protein families contains the majority of genes, and most genetic variability observed within the families is neutral or nearly neutral.

Microbial genes encode the majority of the functional repertoire of life on Earth. However, despite increasing efforts in metagenomic sequencing of various habitats (1-3), little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and used it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contains the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.
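
The statement that 0.6% of families account for 50% of the genes is a concentration statistic. As a minimal sketch of how such a figure could be computed, assuming only a list of per-family gene counts, the Python below sorts families from largest to smallest and reports the smallest fraction of families whose genes reach a given share of the total. The function name `fraction_of_families_covering` and the toy counts are hypothetical, invented for illustration; they are not the paper's data or code.

```python
from typing import Sequence


def fraction_of_families_covering(family_sizes: Sequence[int], share: float = 0.5) -> float:
    """Smallest fraction of families that together hold at least `share` of all genes."""
    if not family_sizes:
        raise ValueError("family_sizes must not be empty")
    total = sum(family_sizes)
    target = share * total
    covered = 0
    # Take families from largest to smallest until the target share is reached.
    for count, size in enumerate(sorted(family_sizes, reverse=True), start=1):
        covered += size
        if covered >= target:
            return count / len(family_sizes)
    return 1.0


# Toy gene counts per family (invented for illustration, not from the paper).
toy_sizes = [50, 20, 10, 5, 5, 4, 3, 1, 1, 1]
print(fraction_of_families_covering(toy_sizes))  # prints 0.1
```

In this toy case the largest of ten families alone holds half of the 100 genes, so the result is 0.1. A strongly skewed size distribution, as the abstract describes, would give a much smaller value.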
