4.7 Article

VPF-Class: taxonomic assignment and host prediction of uncultivated viruses based on viral protein families

Journal

BIOINFORMATICS
Volume 37, Issue 13, Pages 1805-1813

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btab026

Funding

  1. Ministerio de Ciencia e Innovacion (MCI)
  2. Agencia Estatal de investigacion (AEI)
  3. European Regional Development Funds (ERDF) [PGC2018-096956-B-C43]
  4. US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility [DE-AC02-05CH11231]
  5. DOE Office of Science [DE-AC02-05CH11231]

The study used Viral Protein Families (VPFs) for the taxonomic classification and host prediction of uncultured viruses, and developed VPF-Class, an automated tool for classifying viral contigs. VPF-Class showed high accuracy in both viral contig classification and host prediction, indicating its potential for use on large metagenomics datasets.
Motivation: Two key steps in the analysis of uncultured viruses recovered from metagenomes are the taxonomic classification of the viral sequences and the identification of putative host(s). Both steps rely mainly on the assignment of viral proteins to orthologs in cultivated viruses. Viral Protein Families (VPFs) can be used for the robust identification of new viral sequences in large metagenomics datasets. Despite the importance of VPF information for viral discovery, VPFs have not yet been explored for determining viral taxonomy and host targets.

Results: In this work, we classified the set of VPFs from the IMG/VR database and developed VPF-Class. VPF-Class is a tool that automates the taxonomic classification and host prediction of viral contigs based on the assignment of their proteins to a set of classified VPFs. Applying VPF-Class to 731K uncultivated virus contigs from the IMG/VR database, we were able to classify 363K contigs at the genus level and predict the host of over 461K contigs. In the RefSeq database, VPF-Class reported an accuracy of nearly 100% in classifying dsDNA, ssDNA and retroviruses at the genus level, considering a membership ratio and a confidence score of 0.2. The accuracy in host prediction was 86.4%, also at the genus level, considering a membership ratio of 0.3 and a confidence score of 0.5. In the prophages dataset, the accuracy in host prediction was 86%, considering a membership ratio of 0.6 and a confidence score of 0.8. Moreover, from the Global Ocean Virome dataset, over 817K viral contigs out of 1 million were classified.
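
The abstract reports accuracy at specific membership-ratio and confidence-score cutoffs. The sketch below illustrates how such cutoffs might be applied to a table of per-contig predictions. It is only an illustration: the `Prediction` record, its field names, the `filter_predictions` helper, the example rows, and the choice of inclusive comparisons are all assumptions made here, not the actual VPF-Class output format or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """One hypothetical prediction row for a viral contig (assumed layout)."""
    contig_id: str
    label: str               # predicted genus or host genus
    membership_ratio: float  # assumed to lie in [0, 1]
    confidence_score: float  # assumed to be non-negative


def filter_predictions(predictions, min_membership_ratio, min_confidence_score):
    """Keep predictions that meet both cutoffs (inclusive, by assumption)."""
    return [
        p for p in predictions
        if p.membership_ratio >= min_membership_ratio
        and p.confidence_score >= min_confidence_score
    ]


# Made-up rows for illustration only; these are not real VPF-Class results.
EXAMPLE_ROWS = [
    Prediction("contig_1", "GenusA", 0.85, 0.90),
    Prediction("contig_2", "GenusB", 0.25, 0.40),
    Prediction("contig_3", "GenusC", 0.10, 0.95),
]

# Cutoffs quoted in the abstract for genus-level classification on RefSeq.
taxonomy_hits = filter_predictions(EXAMPLE_ROWS, 0.2, 0.2)

# Stricter cutoffs quoted in the abstract for prophage host prediction.
prophage_hits = filter_predictions(EXAMPLE_ROWS, 0.6, 0.8)

print([p.contig_id for p in taxonomy_hits])
print([p.contig_id for p in prophage_hits])
```

Raising either cutoff keeps fewer predictions, which is consistent with the abstract quoting different cutoff pairs for different datasets and tasks.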

Recommended

Article Ecology

Dissecting the dominant hot spring microbial populations based on community-wide sampling at single-cell genomic resolution

Robert M. Bowers, Stephen Nayfach, Frederik Schulz, Sean P. Jungbluth, Ilona A. Ruhl, Andriy Sheremet, Janey Lee, Danielle Goudeau, Emiley A. Eloe-Fadrosh, Ramunas Stepanauskas, Rex R. Malmstrom, Nikos C. Kyrpides, Peter F. Dunfield, Tanja Woyke

Summary: Advancements in single-cell genomics have enabled rapid and affordable sequencing of microbial communities, providing a comprehensive snapshot of community composition and function. This approach also allows for the direct linkage of mobile elements to hosts and analysis of population heterogeneity among dominant community members.

ISME JOURNAL (2022)

Article Biochemistry & Molecular Biology

Deeplasmid: deep learning accurately separates plasmids from bacterial chromosomes

William B. Andreopoulos, Alexander M. Geller, Miriam Lucke, Jan Balewski, Alicia Clum, Natalia N. Ivanova, Asaf Levy

Summary: Plasmids are mobile genetic elements that play a key role in microbial ecology and evolution by mediating horizontal transfer of important genes. Deeplasmid is a deep learning tool that accurately identifies plasmids from bacterial chromosomes. It can predict the presence of novel plasmids with high reliability, demonstrating its ability to detect new genetic elements.

NUCLEIC ACIDS RESEARCH (2022)

Article Agronomy

Medicago root nodule microbiomes: insights into a complex ecosystem with potential candidates for plant growth promotion

Pilar Martinez-Hidalgo, Ethan A. Humm, David W. Still, Baochen Shi, Matteo Pellegrini, Gabriela de la Roca, Esteban Veliz, Maskit Maymon, Pierrick Bru, Marcel Huntemann, Alicia Clum, Krishnaveni Palaniappan, Neha Varghese, Supratim Mukherjee, T. B. K. Reddy, Chris Daum, Natalia N. Ivanova, Nikos C. Kyrpides, Nicole Shapiro, Emiley A. Eloe-Fadrosh, Ann M. Hirsch

Summary: By studying the legume nodule microbiome, researchers were able to identify potential Plant Growth-Promoting Bacteria from Medicago nodules. They isolated and characterized 51 bacterial strains, including Bacillus and Micromonospora, which showed growth-promoting activities in planta. The comparison of biodiversity between undomesticated and cultivated Medicago roots and nodules highlighted the potential of these microbes for sustainable agriculture.

PLANT AND SOIL (2022)

Article Biochemistry & Molecular Biology

Identifying candidate structured RNAs in CRISPR operons

Brayon J. Fremin, Nikos C. Kyrpides

Summary: In this study, a large-scale comparative genomics approach was used to predict 156 novel candidate structured RNAs from 36,111 CRISPR-Cas systems, some of which overlapped with coding genes. This highlights the importance of expanding the search windows in coding regions for the identification of novel structured RNAs.

RNA BIOLOGY (2022)

Article Microbiology

The Genome of the Acid Soil-Adapted Strain Rhizobium favelukesii OR191 Encodes Determinants for Effective Symbiotic Interaction With Both an Inverted Repeat Lacking Clade and a Phaseoloid Legume Host

Bertrand Eardly, Wan Adnawani Meor Osman, Julie Ardley, Jaco Zandberg, Margaret Gollagher, Peter van Berkum, Patrick Elia, Dora Marinova, Rekha Seshadri, T. B. K. Reddy, Natalia Ivanova, Amrita Pati, Tanja Woyke, Nikos Kyrpides, Matthys Loedolff, Damian W. Laird, Wayne Reeve

Summary: The study identified the genome features of R. favelukesii OR191 important for symbiotic interactions with Medicago and Phaseolus vulgaris, including acid adaptation loci, Nod factor synthesis genes, and nitrogen fixation genes. These findings provide insights into the genetic basis of nodulation requirements and symbiotic effectiveness with different hosts.

FRONTIERS IN MICROBIOLOGY (2022)

Article Ecology

The role of zinc in the adaptive evolution of polar phytoplankton

Naihao Ye, Wentao Han, Andrew Toseland, Yitao Wang, Xiao Fan, Dong Xu, Cock van Oosterhout, Igor Grigoriev, Alessandro Tagliabue, Jian Zhang, Yan Zhang, Jian Ma, Huan Qiu, Youxun Li, Xiaowen Zhang, Thomas Mock

Summary: This study reveals that polar microalgae have a higher demand for zinc due to elevated cellular levels of zinc-binding proteins. Zinc plays an important role in supporting photosynthetic growth in eukaryotic polar phytoplankton, which is critical for algal colonization of low-temperature polar oceans.

NATURE ECOLOGY & EVOLUTION (2022)

Article Biochemistry & Molecular Biology

Twenty-five years of Genomes OnLine Database (GOLD): data updates and new features in v.9

Supratim Mukherjee, Dimitri Stamatis, Cindy Tianqing Li, Galina Ovchinnikova, Jon Bertsch, Jagadish Chandrabose Sundaramurthi, Mahathi Kandimalla, Paul A. Nicolopoulos, Alessandro Favognano, I-Min A. Chen, Nikos C. Kyrpides, T. B. K. Reddy

Summary: The Genomes OnLine Database (GOLD) continues to serve as a flagship genomic metadata repository, providing freely available projects and metadata for large-scale comparative genomics analysis. New features and components have been added in the latest GOLD v.9 version.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

Mining metatranscriptomes reveals a vast world of viroid-like circular RNAs

Benjamin D. Lee, Uri Neri, Simon Roux, Yuri I. Wolf, Antonio Pedro Camargo, Mart Krupovic, Peter Simmonds, Nikos Kyrpides, Uri Gophna, Valerian V. Dolja, Eugene V. Koonin

Summary: We developed a computational pipeline to identify viroid-like cccRNAs and found a 5-fold increase in the number of identified elements compared to previous studies. The presence of viroid-like cccRNAs in diverse transcriptomes and ecosystems suggests that their host range is broader than currently known.

Article Ecology

Plant microbiomes harbor potential to promote nutrient turnover in impoverished substrates of a Brazilian biodiversity hotspot

Antonio P. Camargo, Rafael S. C. de Souza, Juliana Jose, Isabel R. Gerhardt, Ricardo A. Dante, Supratim Mukherjee, Marcel Huntemann, Nikos C. Kyrpides, Marcelo F. Carazzolle, Paulo Arruda

Summary: The grassland ecosystem of Brazilian campos rupestres has low concentrations of phosphorus and nitrogen, yet supports a high plant diversity. This study explores the taxonomic profile and functional potential of microbial communities associated with two plant species of the campos rupestres. The results show that the soil and rock communities associated with these plants share a core group of efficient colonizers enriched in certain bacterial families. The microbial populations associated with plant roots have a genetic repertoire for organic compound intake, phosphorus and nitrogen turnover, highlighting their role in nutrient availability.

ISME JOURNAL (2023)

Article Biochemistry & Molecular Biology

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata

Antonio Pedro Camargo, Stephen Nayfach, I-Min A. Chen, Krishnaveni Palaniappan, Anna Ratner, Ken Chu, Stephan J. Ritter, T. B. K. Reddy, Supratim Mukherjee, Frederik Schulz, Lee Call, Russell Y. Neches, Tanja Woyke, Natalia N. Ivanova, Emiley A. Eloe-Fadrosh, Nikos C. Kyrpides, Simon Roux

Summary: Viruses play critical roles in all microbiomes and their genomic diversity and impacts on biological processes are extensively explored through metagenomics. IMG/VR is a platform providing access to a large collection of viral sequences along with functional annotation and metadata. The latest version, IMG/VR v4, contains over 15 million virus genomes and genome fragments.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

The IMG/M data management and analysis system v.7: content updates and new features

I-Min A. Chen, Ken Chu, Krishnaveni Palaniappan, Anna Ratner, Jinghua Huang, Marcel Huntemann, Patrick Hajek, Stephan J. Ritter, Cody Webb, Dongying Wu, Neha J. Varghese, T. B. K. Reddy, Supratim Mukherjee, Galina Ovchinnikova, Matt Nolan, Rekha Seshadri, Simon Roux, Axel Visel, Tanja Woyke, Emiley A. Eloe-Fadrosh, Nikos C. Kyrpides, Natalia N. Ivanova

Summary: The Integrated Microbial Genomes & Microbiomes system (IMG/M) at the Department of Energy Joint Genome Institute (JGI) supports comparative analysis of genomes, metagenomes, and metatranscriptomes. It includes datasets from JGI, as well as imported datasets from public sources and user-submitted datasets. In recent years, efforts have been made to improve the annotation pipeline, upgrade reference database versions, and add new analysis functionalities.

NUCLEIC ACIDS RESEARCH (2023)

Article Multidisciplinary Sciences

Exploring the expressiveness of abstract metabolic networks

Irene Garcia, Bessem Chouaia, Merce Llabres, Marta Simeoni

Summary: Metabolism is a complex network structure composed of interconnected chemical reactions. An abstract metabolic network, represented by metabolic pathways as nodes and shared compounds as edges, is a suitable model for large-scale comparison of organisms' metabolism. By using graph kernel methods, pairwise comparisons of abstract metabolic networks show that they can discriminate macro evolutionary events and capture key steps in metabolism evolution.

PLOS ONE (2023)

Article Mathematical & Computational Biology

Standardized naming of microbiome samples in Genomes OnLine Database

Supratim Mukherjee, Galina Ovchinnikova, Dimitri Stamatis, Cindy Tianqing Li, I-Min A. Chen, Nikos C. Kyrpides, T. B. K. Reddy

Summary: The power of next-generation sequencing has led to a massive increase in projects aiming to understand the diversity of complex microbial environments. However, the lack of reporting standards for microbiome data and samples poses a challenge for follow-up studies. To address this, the Genomes OnLine Database (GOLD) has developed a standardized naming system for microbiome samples and continues to provide the research community with well-curated, understandable names for metagenomes and metatranscriptomes. The authors recommend adopting this naming system as a best practice to improve the interoperability and reusability of microbiome data.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2023)

Article Cell Biology

Expanding the genomic encyclopedia of Actinobacteria with 824 isolate reference genomes

Rekha Seshadri, Simon Roux, Katharina J. Huber, Dongying Wu, Sora Yu, Dan Udwary, Lee Call, Stephen Nayfach, Richard L. Hahnke, Rudiger Pukall, James R. White, Neha J. Varghese, Cody Webb, Krishnaveni Palaniappan, Lorenz C. Reimer, Joaquim Sarda, Jonathon Bertsch, Supratim Mukherjee, T. B. K. Reddy, Patrick P. Hajek, Marcel Huntemann, I-Min A. Chen, Alex Spunde, Alicia Clum, Nicole Shapiro, Zong-Yen Wu, Zhiying Zhao, Yuguang Zhou, Lyudmila Evtushenko, Sofie Thijs, Vincent Stevens, Emiley A. Eloe-Fadrosh, Nigel J. Mouncey, Yasuo Yoshikuni, William B. Whitman, Hans-Peter Klenk, Tanja Woyke, Markus Goeker, Nikos C. Kyrpides, Natalia N. Ivanova

Summary: The study presents a comprehensive analysis of actinobacterial diversity, showing that only a small portion of this diversity is represented by sequenced genomes. The comparison of gene functions reveals novel determinants of host-microbe interaction and environment-specific adaptations. The analysis of biosynthetic gene clusters highlights the role of horizontal gene transfer and gene loss in shaping secondary metabolite repertoire.

CELL GENOMICS (2022)
