4.7 Article

Species Tree Inference from Gene Splits by Unrooted STAR Methods

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2016.2604812

Keywords

Coalescent model; STAR algorithm; NJ(st); species tree

Funding

  1. National Science Foundation
  2. U.S. Department of Homeland Security
  3. U.S. Department of Agriculture through NSF [EF-0832858]
  4. University of Tennessee, Knoxville
  5. National Institutes of Health under Joint DMS/NIGMS Initiative to Support Research at the Interface of the Biological and Mathematical Sciences [R01 GM117590]
  6. Directorate for Biological Sciences
  7. Division of Biological Infrastructure [1300426] (funding source: National Science Foundation)

The NJ(st) method was proposed by Liu and Yu to infer a species tree topology from unrooted topological gene trees. While its statistical consistency under the multispecies coalescent model was established only for a four-taxon tree, simulations demonstrated its good performance on gene trees inferred from sequences for many taxa. Here, we prove the statistical consistency of the method for an arbitrarily large species tree. Our approach connects NJ(st) to a generalization of the STAR method of Liu, Pearl, and Edwards, and a previous theoretical analysis of it. We further show NJ(st) utilizes only the distribution of splits in the gene trees, and not their individual topologies. Finally, we discuss how multiple samples per taxon per gene should be handled for statistical consistency.
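The observation that NJ(st) depends only on gene tree splits can be illustrated with a minimal Python sketch. It assumes one sample per taxon, represents each unrooted gene tree by one side of each of its nontrivial splits, and takes the internode distance between two taxa to be the number of internal nodes on the path joining them, which equals one plus the number of nontrivial splits separating them. The function names and the toy gene trees are hypothetical and chosen only for illustration; the tree-building step that NJ(st) would apply to the averaged distances is omitted.

```python
from itertools import combinations


def internode_distance(splits, a, b):
    """Internal nodes on the path from a to b in one unrooted gene tree.

    `splits` holds one side (a frozenset of taxa) of each nontrivial split.
    Assumed convention: distance = 1 + number of splits separating a and b.
    """
    separating = sum(1 for side in splits if (a in side) != (b in side))
    return 1 + separating


def average_internode_distances(gene_trees, taxa):
    """Average the per-tree internode distances over all gene trees."""
    n_trees = len(gene_trees)
    return {
        (a, b): sum(internode_distance(t, a, b) for t in gene_trees) / n_trees
        for a, b in combinations(sorted(taxa), 2)
    }


# Toy data (hypothetical): three four-taxon gene trees,
# two with split ab|cd and one with split ac|bd.
taxa = ["a", "b", "c", "d"]
gene_trees = [
    {frozenset({"a", "b"})},
    {frozenset({"a", "b"})},
    {frozenset({"a", "c"})},
]

dist = average_internode_distances(gene_trees, taxa)
for pair in sorted(dist):
    print(pair, round(dist[pair], 3))
```

Because each pairwise distance is a sum over splits, only how often each split occurs across the gene trees affects the averaged matrix, not which splits occur together in the same tree.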

Recommended

Article Biochemical Research Methods

NANUQ: a method for inferring species networks from gene trees under the coalescent model

Elizabeth S. Allman, Hector Banos, John A. Rhodes

ALGORITHMS FOR MOLECULAR BIOLOGY (2019)

Article Biology

Inferring Metric Trees from Weighted Quartets via an Intertaxon Distance

Samaneh Yourdkhani, John A. Rhodes

BULLETIN OF MATHEMATICAL BIOLOGY (2020)

Article Biochemical Research Methods

MSCquartets 1.0: quartet methods for species trees and networks under the multispecies coalescent model in R

John A. Rhodes, Hector Banos, Jonathan D. Mitchell, Elizabeth S. Allman

Summary: MSCquartets is an R package for species tree hypothesis testing, inference of species trees, and inference of species networks. It takes collections of metric or topological locus trees as input, summarizes them using quartets, and displays hypothesis test results in a simplex plot. The package implements algorithms for topological and metric species tree inference, as well as level-1 topological species network inference.

BIOINFORMATICS (2021)

Article Biology

Identifiability of species network topologies from genomic sequences using the logDet distance

Elizabeth S. Allman, Hector Banos, John A. Rhodes

Summary: Inference of network-like evolutionary relationships between species from genomic data must consider both gene flow and incomplete lineage sorting. Standard methods have high computational demands and limit the size of analyzed datasets. This study shows that logDet distances computed from genomic-scale sequences can efficiently recover network relationships in the level-1 ultrametric case. It applies to both unlinked site data and sequence data.

JOURNAL OF MATHEMATICAL BIOLOGY (2022)

Article Biology

The tree of blobs of a species network: identifiability under the coalescent

Elizabeth S. Allman, Hector Banos, Jonathan D. Mitchell, John A. Rhodes

Summary: Inference of species networks under the Network Multispecies Coalescent Model is limited by computational demands and the complexity of the networks. This study focuses on the tree of blobs, where non-cut edges are contracted to nodes, to infer a general species network. An identifiability theorem is established, stating that most features of the unrooted tree of blobs can be determined from the distribution of gene quartet topologies. This suggests a practical algorithm for tree of blobs inference.

JOURNAL OF MATHEMATICAL BIOLOGY (2023)

Article Biochemical Research Methods

Testing Multispecies Coalescent Simulators Using Summary Statistics

Elizabeth S. Allman, Hector Banos, John A. Rhodes

Summary: As more genomic-scale datasets are being used for species tree inference, simulators of the multispecies coalescent (MSC) process are necessary to test and evaluate new inference methods. However, the simulators themselves need to be tested to ensure their validity. This study develops methods to check if a collection of gene trees aligns with the MSC model on a given species tree. The tests conducted on well-known simulators reveal flaws in some of the samples, and are implemented in the freely available R package MSCsimtester for easy application by developers and users.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Biochemical Research Methods

Parameter Identifiability of a Multitype Pure-Birth Model of Speciation

Dakota Dragomir, Elizabeth S. Allman, John A. Rhodes

Summary: Diversification models describe the random growth of evolutionary trees to model the historical relationships of species. This study establishes the identifiability of parameters for one form of such a model, a multitype pure birth model of speciation, based on an asymptotic distribution derived from a single tree observation. The key finding is that type observations are not needed at any internal points or leaves of the tree for practical applications.

JOURNAL OF COMPUTATIONAL BIOLOGY (2023)

Article Biochemical Research Methods

Parameter Identifiability for a Profile Mixture Model of Protein Evolution

Samaneh Yourdkhani, Elizabeth S. Allman, John A. Rhodes

Summary: The PM model for protein evolution describes sequence data with sites following multiple related substitution processes depending on different amino acid distributions. Using algebraic methods, parameters in the PM model are shown to be identifiable for empirical analyses, particularly when the tree relates 9 or more taxa and the number of profiles is less than 74.

JOURNAL OF COMPUTATIONAL BIOLOGY (2021)

Article Biochemical Research Methods

Topological Metrizations of Trees, and New Quartet Methods of Tree Inference

John A. Rhodes

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2020)

Article Statistics & Probability

Hypothesis testing near singularities and boundaries

Jonathan D. Mitchell, Elizabeth S. Allman, John A. Rhodes

ELECTRONIC JOURNAL OF STATISTICS (2019)

Article Mathematics, Applied

Species Tree Inference from Genomic Sequences Using the Log-Det Distance

Elizabeth S. Allman, Colby Long, John A. Rhodes

SIAM JOURNAL ON APPLIED ALGEBRA AND GEOMETRY (2019)
