Article

Comparing and aggregating partially resolved trees

Journal

THEORETICAL COMPUTER SCIENCE
Volume 412, Issue 48, Pages 6634-6652

Publisher

ELSEVIER
DOI: 10.1016/j.tcs.2011.08.027

Keywords

Aggregation; Computational biology; Consensus; Hausdorff distance; Phylogenetic trees; Quartet distance; Triplet distance

Funding

  1. National Science Foundation [DEB-0334832, DEB-0829674, CCF-106029]
  2. Division of Computing and Communication Foundations
  3. Directorate for Computer & Information Science & Engineering [1017189] Funding Source: National Science Foundation

Partially-resolved - that is, non-binary - trees arise frequently in the analysis of species evolution. Non-binary nodes, also called multifurcations, must be treated carefully, since they can be interpreted as reflecting either lack of information or actual evolutionary history. While several distance measures exist for comparing trees, none of them deals explicitly with this dichotomy. Here we introduce two kinds of distance measures between rooted and unrooted partially-resolved phylogenetic trees over the same set of species; the measures address multifurcations directly. For rooted trees, the measures are based on the topologies the input trees induce on triplets; that is, on three-element subsets of the set of species. For unrooted trees, the measures are based on quartets (four-element subsets). The first class of measures consists of parametric distances, in which a parameter weighs the difference between an unresolved triplet/quartet topology and a resolved one. The second class of measures is based on the Hausdorff distance, where each tree is viewed as the set of all possible ways in which it can be refined to eliminate unresolved nodes. We give efficient algorithms for computing parametric distances and give conditions under which Hausdorff distances can be calculated approximately in polynomial time. Additionally, we (i) derive the expected value of the parametric distance between two random trees, (ii) characterize the conditions under which parametric distances are near-metrics or metrics, (iii) study the computational and algorithmic properties of consensus tree methods based on the measures, and (iv) analyze the interrelationships among Hausdorff and parametric distances. © 2011 Elsevier B.V. All rights reserved.
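
To make the idea of a parametric triplet distance concrete, the following is a minimal brute-force sketch for rooted trees, not the efficient algorithm the abstract refers to. It assumes one plausible reading of the abstract: a triplet resolved differently in the two trees costs 1, a triplet resolved in one tree and unresolved in the other costs p, and all other triplets cost 0. The nested-tuple tree encoding and the names `leaf_paths`, `triplet_topology`, and `parametric_triplet_distance` are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def leaf_paths(tree, path=(), out=None):
    """Map each leaf label to the tuple of child indices from the root to it.

    Assumed encoding: a leaf is a string, an internal node is a tuple of subtrees.
    """
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = path
    else:
        for i, child in enumerate(tree):
            leaf_paths(child, path + (i,), out)
    return out


def lca_depth(p, q):
    """Depth of the lowest common ancestor = length of the shared path prefix."""
    depth = 0
    for x, y in zip(p, q):
        if x != y:
            break
        depth += 1
    return depth


def triplet_topology(paths, a, b, c):
    """Return the pair grouped together (e.g. {a, b} for ab|c), or None if unresolved."""
    depths = {
        frozenset((a, b)): lca_depth(paths[a], paths[b]),
        frozenset((a, c)): lca_depth(paths[a], paths[c]),
        frozenset((b, c)): lca_depth(paths[b], paths[c]),
    }
    if len(set(depths.values())) == 1:
        return None  # all three LCAs coincide: the triplet is unresolved
    return max(depths, key=depths.get)  # the pair with the deepest LCA


def parametric_triplet_distance(t1, t2, p):
    """Brute-force O(n^3) distance under the cost model assumed above."""
    paths1, paths2 = leaf_paths(t1), leaf_paths(t2)
    if set(paths1) != set(paths2):
        raise ValueError("trees must be over the same set of species")
    total = 0.0
    for a, b, c in combinations(sorted(paths1), 3):
        x = triplet_topology(paths1, a, b, c)
        y = triplet_topology(paths2, a, b, c)
        if x == y:
            continue  # same topology (or both unresolved): no cost
        total += p if (x is None or y is None) else 1.0
    return total


# Hypothetical example trees over the species {a, b, c, d}.
resolved = ((("a", "b"), "c"), "d")     # fully resolved
partial = (("a", "b", "c"), "d")        # a, b, c form a multifurcation
conflicting = ((("a", "c"), "b"), "d")  # resolves {a, b, c} differently
print(parametric_triplet_distance(resolved, partial, 0.5))
print(parametric_triplet_distance(resolved, conflicting, 0.5))
```

In this sketch, `resolved` and `partial` differ only on the triplet {a, b, c}, which is resolved in one tree and unresolved in the other, so their distance equals p. The pair `resolved` and `conflicting` disagree on the same triplet with two different resolved topologies, so their distance is 1 regardless of p.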