4.8 Article

Mixing genome annotation methods in a comparative analysis inflates the apparent number of lineage-specific genes

Journal

CURRENT BIOLOGY
Volume 32, Issue 12, Pages 2632-+

Publisher

CELL PRESS
DOI: 10.1016/j.cub.2022.04.085

Funding

  1. Howard Hughes Medical Institute
  2. NIH [R01-GM43987, R01-HG009116]
  3. NSF-Simons Center for the Mathematical and Statistical Analysis of Biology (NSF) [1764269]
  4. NSF-Simons Quantitative Biology PhD Student Fellowship
  5. FAS Division of Science, Research Computing Group at Harvard University
  6. NSF-Simons Center for the Mathematical and Statistical Analysis of Biology (Simons) [594596]

Comparisons of genomes can help identify lineage-specific genes, but annotation heterogeneity can lead to errors in determining the number of these genes.
Comparisons of genomes of different species are used to identify lineage-specific genes, those genes that appear unique to one species or clade. Lineage-specific genes are often thought to represent genetic novelty that underlies unique adaptations. Identification of these genes depends not only on genome sequences, but also on inferred gene annotations. Comparative analyses typically use available genomes that have been annotated using different methods, increasing the risk that orthologous DNA sequences may be erroneously annotated as a gene in one species but not another, appearing lineage specific as a result. To evaluate the impact of such "annotation heterogeneity," we identified four clades of species with sequenced genomes with more than one publicly available gene annotation, allowing us to compare the number of lineage-specific genes inferred when differing annotation methods are used to those resulting when annotation method is uniform across the clade. In these case studies, annotation heterogeneity increases the apparent number of lineage-specific genes by up to 15-fold, suggesting that annotation heterogeneity is a substantial source of potential artifact.
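
The comparison described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes, purely for illustration, that orthologous loci have already been matched across species and that each annotation is just the set of locus IDs it calls as genes. The function names, locus IDs, and example sets are all hypothetical.

```python
def lineage_specific(focal: set[str], outgroups: list[set[str]]) -> set[str]:
    """Loci annotated as genes in the focal species but in no outgroup annotation."""
    annotated_elsewhere = set().union(*outgroups)
    return focal - annotated_elsewhere


def fold_inflation(n_heterogeneous: int, n_uniform: int) -> float:
    """Ratio of lineage-specific gene counts: mixed-method vs. uniform-method annotation."""
    if n_uniform == 0:
        raise ValueError("uniform annotation yields no lineage-specific genes; ratio undefined")
    return n_heterogeneous / n_uniform


# Hypothetical toy data: the same outgroup genome annotated by two methods.
focal_a = {"locus1", "locus2", "locus3", "locus4"}   # focal species, method A
outgroup_a = {"locus1", "locus2", "locus3"}          # outgroup, method A (uniform case)
outgroup_b = {"locus1"}                              # outgroup, method B (heterogeneous case)

uniform = lineage_specific(focal_a, [outgroup_a])
heterogeneous = lineage_specific(focal_a, [outgroup_b])

print(len(uniform), len(heterogeneous))
print(fold_inflation(len(heterogeneous), len(uniform)))
```

In this toy case, locus2 and locus3 look lineage specific only because method B did not annotate them in the outgroup, which is the kind of artifact the abstract attributes to annotation heterogeneity.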

Recommended

Article Plant Sciences

Habitat-Associated Life History and Stress-Tolerance Variation in Arabidopsis arenosa

Pierre Baduel, Brian Arnold, Cara M. Weisman, Ben Hunter, Kirsten Bomblies

PLANT PHYSIOLOGY (2016)

Article Multidisciplinary Sciences

Borrowed alleles and convergence in serpentine adaptation

Brian J. Arnold, Brett Lahner, Jeffrey M. DaCosta, Caroline M. Weisman, Jesse D. Hollister, David E. Salt, Kirsten Bomblies, Levi Yant

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2016)

Article Ecology

Pervasive population genomic consequences of genome duplication in Arabidopsis arenosa

Patrick Monnahan, Filip Kolar, Pierre Baduel, Christian Sailer, Jordan Koch, Robert Horvath, Benjamin Laenen, Roswitha Schmickl, Pirita Paajanen, Gabriela Sramkova, Magdalena Bohutinska, Brian Arnold, Caroline M. Weisman, Karol Marhold, Tanja Slotte, Kirsten Bomblies, Levi Yant

NATURE ECOLOGY & EVOLUTION (2019)

Review Biochemistry & Molecular Biology

The Origins and Functions of De Novo Genes: Against All Odds?

Caroline M. Weisman

Summary: This review provides an overview of the origins and molecular functions of de novo genes, and speculates on how they manage to emerge despite opposing odds.

JOURNAL OF MOLECULAR EVOLUTION (2022)

Article Genetics & Heredity

Defining characteristics and conservation of poorly annotated genes in Caenorhabditis elegans using WormCat 2.0

Daniel P. Higgins, Caroline M. Weisman, Dominique S. Lui, Frank A. D'Agostino, Amy K. Walker

Summary: Omics tools provide broad datasets for biological discovery, but the current computational tools for identifying important genes or pathways have biases towards well-described pathways, limiting their utility for poorly annotated genes. We developed WormCat, an annotation and category enrichment tool, which retains genes with no annotation information as a special UNASSIGNED category. We found that the UNASSIGNED gene category enrichment exhibits tissue-specific expression patterns and can include genes with known functions. Some of the UNASSIGNED genes have human orthologs, including those linked to human diseases. A new method called abSENSE suggests that the failure of BLAST to detect homology explains the lineage specificity of many UNASSIGNED genes, indicating a larger subset could be related to human genes. WormCat provides an annotation strategy that allows the association of UNASSIGNED genes with specific phenotypes and known pathways.

GENETICS (2022)

Article Biochemistry & Molecular Biology

Many, but not all, lineage-specific genes can be explained by homology detection failure

Caroline M. Weisman, Andrew W. Murray, Sean R. Eddy

PLOS BIOLOGY (2020)
