Article

Heterogeneity in DNA Multiple Alignments: Modeling, Inference, and Applications in Motif Finding

Journal

BIOMETRICS
Volume 66, Issue 3, Pages 694-704

Publisher

WILEY-BLACKWELL
DOI: 10.1111/j.1541-0420.2009.01362.x

Keywords

Background modeling; Evolutionary conservation; HMM; Motif finding; Nucleotide base composition; Segmentation; Transcription factor binding site

Funding

  1. NSF [DMS-0805491]
  2. Direct For Mathematical & Physical Scien [0805491] Funding Source: National Science Foundation
  3. Division Of Mathematical Sciences [0805491] Funding Source: National Science Foundation

Transcription factors bind sequence-specific sites in DNA to regulate gene transcription. Identifying transcription factor binding sites (TFBSs) is an important step for understanding gene regulation. Although sophisticated in modeling TFBSs and their combinatorial patterns, computational methods for TFBS detection and motif finding often make oversimplified homogeneous model assumptions for background sequences. Since nucleotide base composition varies across genomic regions, it is expected to be helpful for motif finding to incorporate the heterogeneity into background modeling. When sequences from multiple species are utilized, variation in evolutionary conservation violates the common assumption of an identical conservation level in multiple alignments. To handle both types of heterogeneity, we propose a generative model in which a segmented Markov chain is used to partition a multiple alignment into regions of homogeneous nucleotide base composition and a hidden Markov model (HMM) is employed to account for different conservation levels. Bayesian inference on the model is developed via Gibbs sampling with dynamic programming recursions. Simulation studies and empirical evidence from biological data sets reveal the dramatic effect of background modeling on motif finding, and demonstrate that the proposed approach is able to achieve substantial improvements over commonly used background models.
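
The abstract mentions an HMM over conservation levels and dynamic programming recursions. As a rough illustration only, the sketch below runs a scaled forward recursion for a toy two-state HMM over alignment columns. It is not the model or code from the paper: the emission rule (a column is either fully conserved or not), the function names, and all parameter values are assumptions made for this example, and the segmented Markov chain and Gibbs sampler are left out entirely.

```python
import math


def column_is_conserved(column):
    """True when every species has the same base in this alignment column."""
    return len(set(column)) == 1


def forward_log_likelihood(alignment, init, trans, p_conserved):
    """Scaled forward recursion for a toy HMM over alignment columns.

    All inputs are hypothetical and chosen for illustration:
    alignment   : list of equal-length strings, one per species
    init        : init[k] = P(first hidden state is k)
    trans       : trans[j][k] = P(next state is k | current state is j)
    p_conserved : p_conserved[k] = P(column fully conserved | state k)
    Returns the log-likelihood and the filtered state distribution
    at the last column.
    """
    n_states = len(init)
    log_lik = 0.0
    alpha = None
    for t, column in enumerate(zip(*alignment)):
        conserved = column_is_conserved(column)
        # Bernoulli emission: conserved vs. not conserved (an assumption).
        emit = [p_conserved[k] if conserved else 1.0 - p_conserved[k]
                for k in range(n_states)]
        if t == 0:
            alpha = [init[k] * emit[k] for k in range(n_states)]
        else:
            alpha = [sum(alpha[j] * trans[j][k] for j in range(n_states)) * emit[k]
                     for k in range(n_states)]
        # Rescale at every column to avoid underflow on long alignments.
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_lik += math.log(scale)
    return log_lik, alpha


# Toy two-species alignment; state 0 = "conserved", state 1 = "diverged".
toy_alignment = ["ACGTACGT", "ACGTTCGA"]
toy_init = [0.5, 0.5]
toy_trans = [[0.9, 0.1], [0.2, 0.8]]
toy_p_conserved = [0.9, 0.2]
log_lik, last_filtered = forward_log_likelihood(
    toy_alignment, toy_init, toy_trans, toy_p_conserved)
```

In a sampler of the kind the abstract describes, forward quantities like these would typically be followed by a backward pass that draws the hidden state path; that step is omitted here.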
