3.9 Article

Toward a standard in structural genome annotation for prokaryotes

Journal

STANDARDS IN GENOMIC SCIENCES
Volume 10, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/s40793-015-0034-9

Keywords

-

Funding

  1. US Department of Energy's Office of Science, Biological and Environmental Research Program
  2. University of California, Lawrence Berkeley National Laboratory [DE-AC02-05CH11231]

Background: In an effort to identify the best practice for finding genes in prokaryotic genomes and to propose it as a standard for automated annotation pipelines, 1,004,576 peptides were collected from publicly available resources and used as a basis for evaluating gene-calling methods. The peptides came from 45 bacterial replicons with average GC content ranging from 31% to 74%, biased toward higher-GC genomes. Automated, manual, and semi-manual methods were used to tally errors in three widely used gene-calling methods, as evidenced by peptides mapped outside the boundaries of called genes.

Results: We found that the consensus set of identical genes predicted by the three methods constitutes only about 70% of the genes predicted by each individual method (with start and stop required to coincide). Peptide data were useful for evaluating some of the differences between gene callers, but were not reliable enough to make the results conclusive, owing to limitations inherent in any proteogenomic study.

Conclusions: A single, unambiguous best practice did not emerge from this analysis, since the available proteomics data were not adequate to provide an objective measurement of the differences in accuracy between these methods. However, as a result of this study, software, reference data, and procedures have been better matched among participants, representing a step toward a much-needed standard. In the absence of sufficient experimental data to establish a universal standard, our recommendation is that any of these methods can be used by the community, as long as a single method is employed across all datasets to be compared.
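
The two measurements mentioned in the abstract (the share of each caller's genes that all callers predict identically, and peptides that fall outside called genes) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Gene` record, the caller names `caller_a`, `caller_b`, `caller_c`, the coordinates, and the simplification that a peptide counts as "covered" when its mapped span lies inside a gene on the same replicon and strand (reading frame is ignored). It is not the procedure used in the study.

```python
from __future__ import annotations

from typing import NamedTuple


class Gene(NamedTuple):
    """A called gene; two calls are identical only if every field matches."""
    replicon: str
    start: int
    stop: int
    strand: str


def consensus_fraction(calls: dict[str, set[Gene]]) -> dict[str, float]:
    """For each caller, the fraction of its genes predicted identically by all callers."""
    consensus = set.intersection(*calls.values())
    return {
        name: (len(consensus) / len(genes) if genes else 0.0)
        for name, genes in calls.items()
    }


def peptide_outside_genes(replicon: str, start: int, stop: int, strand: str,
                          genes: set[Gene]) -> bool:
    """True if the mapped peptide span is not contained in any called gene.

    Simplification (assumption): containment on the same replicon and strand
    is enough; the reading frame is not checked.
    """
    return not any(
        g.replicon == replicon and g.strand == strand
        and g.start <= start and stop <= g.stop
        for g in genes
    )


# Hypothetical toy calls from three unnamed gene callers.
calls = {
    "caller_a": {Gene("rep1", 1, 300, "+"), Gene("rep1", 500, 900, "-"),
                 Gene("rep1", 1000, 1600, "+")},
    "caller_b": {Gene("rep1", 1, 300, "+"), Gene("rep1", 500, 900, "-"),
                 Gene("rep1", 1030, 1600, "+")},  # same stop, later start
    "caller_c": {Gene("rep1", 1, 300, "+"), Gene("rep1", 500, 900, "-"),
                 Gene("rep1", 1000, 1600, "+"), Gene("rep1", 2000, 2300, "+")},
}

fractions = consensus_fraction(calls)

# A hypothetical peptide mapped just upstream of caller_b's start site.
missed_by_b = peptide_outside_genes("rep1", 1005, 1029, "+", calls["caller_b"])
missed_by_a = peptide_outside_genes("rep1", 1005, 1029, "+", calls["caller_a"])
```

In this toy data only two genes are identical across all three callers, so the consensus share differs per caller, and the peptide upstream of `caller_b`'s start is the kind of evidence the abstract describes as a peptide mapped outside the boundaries of a called gene.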

Recommended

Article Biochemistry & Molecular Biology

Identifying candidate structured RNAs in CRISPR operons

Brayon J. Fremin, Nikos C. Kyrpides

Summary: In this study, a large-scale comparative genomics approach was used to predict 156 novel candidate structured RNAs from 36,111 CRISPR-Cas systems, some of which overlapped with coding genes. This highlights the importance of expanding the search windows in coding regions for the identification of novel structured RNAs.

RNA BIOLOGY (2022)

Article Microbiology

The Genome of the Acid Soil-Adapted Strain Rhizobium favelukesii OR191 Encodes Determinants for Effective Symbiotic Interaction With Both an Inverted Repeat Lacking Clade and a Phaseoloid Legume Host

Bertrand Eardly, Wan Adnawani Meor Osman, Julie Ardley, Jaco Zandberg, Margaret Gollagher, Peter van Berkum, Patrick Elia, Dora Marinova, Rekha Seshadri, T. B. K. Reddy, Natalia Ivanova, Amrita Pati, Tanja Woyke, Nikos Kyrpides, Matthys Loedolff, Damian W. Laird, Wayne Reeve

Summary: The study identified the genome features of R. favelukesii OR191 important for symbiotic interactions with Medicago and Phaseolus vulgaris, including acid adaptation loci, Nod factor synthesis genes, and nitrogen fixation genes. These findings provide insights into the genetic basis of nodulation requirements and symbiotic effectiveness with different hosts.

FRONTIERS IN MICROBIOLOGY (2022)

Article Ecology

The role of zinc in the adaptive evolution of polar phytoplankton

Naihao Ye, Wentao Han, Andrew Toseland, Yitao Wang, Xiao Fan, Dong Xu, Cock van Oosterhout, Igor Grigoriev, Alessandro Tagliabue, Jian Zhang, Yan Zhang, Jian Ma, Huan Qiu, Youxun Li, Xiaowen Zhang, Thomas Mock

Summary: This study reveals that polar microalgae have a higher demand for zinc due to elevated cellular levels of zinc-binding proteins. Zinc plays an important role in supporting photosynthetic growth in eukaryotic polar phytoplankton, which is critical for algal colonization of low-temperature polar oceans.

NATURE ECOLOGY & EVOLUTION (2022)

Article Microbiology

Sodalis ligni Strain 159R Isolated from an Anaerobic Lignin-Degrading Consortium

Gina Chaput, Jacob Ford, Lani DeDiego, Achala Narayanan, Wing Yin Tam, Meghan Whalen, Marcel Huntemann, Alicia Clum, Alex Spunde, Manoj Pillay, Krishnaveni Palaniappan, Neha Varghese, Natalia Mikhailova, I-Min Chen, Dimitrios Stamatis, T. B. K. Reddy, Ronan O'Malley, Chris Daum, Nicole Shapiro, Natalia Ivanova, Nikos C. Kyrpides, Tanja Woyke, Tijana Glavina del Rio, Kristen M. DeAngelis

Summary: In this study, a novel bacterium strain 159R belonging to the genus Sodalis was successfully isolated from temperate forest soil. It has the capability to depolymerize lignin and can survive in anaerobic conditions. Its application potential in lignocellulosic biofuel production is promising.

MICROBIOLOGY SPECTRUM (2022)

Article Cell Biology

Thousands of small, novel genes predicted in global phage genomes

Brayon J. Fremin, Ami S. Bhatt, Nikos C. Kyrpides

Summary: This study used a large-scale comparative genomics approach to discover that small genes are more prevalent in phage genomes than in host prokaryotic genomes. These small genes may have important functions, such as encoding anti-CRISPR proteins and antimicrobial proteins.

CELL REPORTS (2022)

Article Biochemistry & Molecular Biology

Twenty-five years of Genomes OnLine Database (GOLD): data updates and new features in v.9

Supratim Mukherjee, Dimitri Stamatis, Cindy Tianqing Li, Galina Ovchinnikova, Jon Bertsch, Jagadish Chandrabose Sundaramurthi, Mahathi Kandimalla, Paul A. Nicolopoulos, Alessandro Favognano, I-Min A. Chen, Nikos C. Kyrpides, T. B. K. Reddy

Summary: The Genomes OnLine Database (GOLD) continues to serve as a flagship genomic metadata repository, providing freely available projects and metadata for large-scale comparative genomics analysis. New features and components have been added in the latest GOLD v.9 version.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

Mining metatranscriptomes reveals a vast world of viroid-like circular RNAs

Benjamin D. Lee, Uri Neri, Simon Roux, Yuri I. Wolf, Antonio Pedro Camargo, Mart Krupovic, Peter Simmonds, Nikos Kyrpides, Uri Gophna, Valerian V. Dolja, Eugene V. Koonin

Summary: We developed a computational pipeline to identify viroid-like cccRNAs and found a 5-fold increase in the number of identified elements compared to previous studies. The presence of viroid-like cccRNAs in diverse transcriptomes and ecosystems suggests that their host range is broader than currently known.

Article Ecology

Plant microbiomes harbor potential to promote nutrient turnover in impoverished substrates of a Brazilian biodiversity hotspot

Antonio P. Camargo, Rafael S. C. de Souza, Juliana Jose, Isabel R. Gerhardt, Ricardo A. Dante, Supratim Mukherjee, Marcel Huntemann, Nikos C. Kyrpides, Marcelo F. Carazzolle, Paulo Arruda

Summary: The grassland ecosystem of Brazilian campos rupestres has low concentrations of phosphorus and nitrogen, yet supports a high plant diversity. This study explores the taxonomic profile and functional potential of microbial communities associated with two plant species of the campos rupestres. The results show that the soil and rock communities associated with these plants share a core group of efficient colonizers enriched in certain bacterial families. The microbial populations associated with plant roots have a genetic repertoire for organic compound intake, phosphorus and nitrogen turnover, highlighting their role in nutrient availability.

ISME JOURNAL (2023)

Article Biochemistry & Molecular Biology

IMG/VR v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata

Antonio Pedro Camargo, Stephen Nayfach, I-Min A. Chen, Krishnaveni Palaniappan, Anna Ratner, Ken Chu, Stephan J. Ritter, T. B. K. Reddy, Supratim Mukherjee, Frederik Schulz, Lee Call, Russell Y. Neches, Tanja Woyke, Natalia N. Ivanova, Emiley A. Eloe-Fadrosh, Nikos C. Kyrpides, Simon Roux

Summary: Viruses play critical roles in all microbiomes and their genomic diversity and impacts on biological processes are extensively explored through metagenomics. IMG/VR is a platform providing access to a large collection of viral sequences along with functional annotation and metadata. The latest version, IMG/VR v4, contains over 15 million virus genomes and genome fragments.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

The IMG/M data management and analysis system v.7: content updates and new features

I-Min A. Chen, Ken Chu, Krishnaveni Palaniappan, Anna Ratner, Jinghua Huang, Marcel Huntemann, Patrick Hajek, Stephan J. Ritter, Cody Webb, Dongying Wu, Neha J. Varghese, T. B. K. Reddy, Supratim Mukherjee, Galina Ovchinnikova, Matt Nolan, Rekha Seshadri, Simon Roux, Axel Visel, Tanja Woyke, Emiley A. Eloe-Fadrosh, Nikos C. Kyrpides, Natalia N. Ivanova

Summary: The Integrated Microbial Genomes & Microbiomes system (IMG/M) at the Department of Energy Joint Genome Institute (JGI) provides support for comparative analysis of various genomes, metagenomes, and metatranscriptomes. It includes datasets from JGI, as well as imported datasets from public sources and user-submitted datasets. In recent years, efforts have been made to improve annotation pipeline, upgrade reference database versions, and add new analysis functionalities.

NUCLEIC ACIDS RESEARCH (2023)

Article Chemistry, Analytical

HyperSCP: Combining Isotopic and Isobaric Labeling for Higher Throughput Single-Cell Proteomics

Yiran Liang, Thy Truong, Aubrianna J. Saxton, Hannah Boekweg, Samuel H. Payne, Pam M. Van Ry, Ryan T. Kelly

Summary: Recent advances in mass spectrometry-based single-cell proteomics have improved sensitivity, but measurement throughput is still limited. To increase throughput, we combined isobaric and isotopic labeling methods for multiplexing. By using SILAC and TMT labeling, we were able to analyze up to 28 single cells in a single LC-MS analysis. With a customized nanowell chip, sample losses were minimized. The measurement throughput could be further increased with a high-duty-cycle multicolumn LC system.

ANALYTICAL CHEMISTRY (2023)

Article Biochemical Research Methods

Toward an Integrated Machine Learning Model of a Proteomics Experiment

Benjamin A. Neely, Viktoria Dorfer, Lennart Martens, Isabell Bludau, Robbin Bouwmeester, Sven Degroeve, Eric W. Deutsch, Siegfried Gessulat, Lukas Kaell, Pawel Palczynski, Samuel H. Payne, Tobias Greisager Rehfeldt, Tobias Schmidt, Veit Schwaemmle, Julian Uszkoreit, Juan Antonio Vizcaino, Mathias Wilhelm, Magnus Palmblad

Summary: In recent years, machine learning has made significant progress in modeling mass spectrometry data for proteomics analysis. A workshop was conducted to evaluate and explore machine learning applications in multidimensional mass spectrometry-based proteomics analysis. The workshop helped identify knowledge gaps, define needs, and discuss the possibilities, challenges, and future opportunities. The summary of the discussions conveys excitement about the potential of machine learning in proteomics and aims to inspire future research.

JOURNAL OF PROTEOME RESEARCH (2023)

Article Biochemical Research Methods

Challenges and Opportunities for Single-cell Computational Proteomics

Hannah Boekweg, Samuel H. Payne

Summary: Single-cell proteomics is growing rapidly, but there is a lack of attention on algorithms for identifying and quantifying proteins. Current algorithms designed for bulk data may not hold true for single-cell data, so it is important to assess their performance and optimize them for single-cell data.

MOLECULAR & CELLULAR PROTEOMICS (2023)

Article Mathematical & Computational Biology

Standardized naming of microbiome samples in Genomes OnLine Database

Supratim Mukherjee, Galina Ovchinnikova, Dimitri Stamatis, Cindy Tianqing Li, I-Min A. Chen, Nikos C. Kyrpides, T. B. K. Reddy

Summary: The power of next-generation sequencing has led to a massive increase in projects aiming to understand the diversity of complex microbial environments. However, the lack of standardized reporting standards for microbiome data and samples poses a challenge for follow-up studies. The Genomes OnLine Database (GOLD) has developed a standardized naming system for microbiome samples to address this issue and has continued to enrich the research community with well-curated and understandable names for metagenomes and metatranscriptomes. This naming system should be adopted as a best practice to improve the interoperability and reusability of microbiome data.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2023)

Article Multidisciplinary Sciences

A proteomic meta-analysis refinement of plasma extracellular vesicles

Milene C. Vallejo, Soumyadeep Sarkar, Emily C. Elliott, Hayden R. Henry, Samantha M. Powell, Ivo Diaz Ludovico, Youngki You, Fei Huang, Samuel H. Payne, Sasanka Ramanadham, Emily K. Sims, Thomas O. Metz, Raghavendra G. Mirmira, Ernesto S. Nakayasu

Summary: Extracellular vesicles (EVs) have important roles in cell-to-cell communication and biomarker studies. In this study, a proteomics meta-analysis was performed to refine the composition of plasma EVs by separating EV proteins and contaminants into different clusters. The refined EV protein list obtained from this study provides a valuable resource for mechanistic and biomarker studies.

SCIENTIFIC DATA (2023)
