4.3 Article

Irredundant tandem motifs

期刊

THEORETICAL COMPUTER SCIENCE
卷 525, 期 -, 页码 89-102

出版社

ELSEVIER
DOI: 10.1016/j.tcs.2013.08.012

关键词

Motifs; Tandem; Patterns; Irredundant; Redundant motifs; String algorithm; Suffix tree

资金

  1. Progetto di Ateneo dell'Universita di Padova [CPDA110239]
  2. Italian Ministry for Education, University and Research [PON 01 02477 2007-2013 - FRAME]
  3. Italian Ministry for Education, University and Research under grant FIT (IDEAS: Un Ambiente Integrato per lo Sviluppo di Applicazioni e Soluzioni)

向作者/读者索取更多资源

Eliminating the possible redundancy from a set of candidate motifs occurring in an input string is fundamental in many applications. The existing techniques proposed to extract irredundant motifs are not suitable when the motifs to search for are structured, i.e., they are made of two (or several) subwords that co-occur in a text string s of length n. The main effort of this work is studying and characterizing a compact class of tandem motifs, that is, pairs of substrings < m(1), m(2)> occurring in tandem within a maximum distance of d symbols in s, where d is an integer constant given in input. To this aim, we first introduce the concept of maximality, related to four specific conditions that hold only for this class of motifs. Then, we eliminate the remaining redundancy by defining the notion of irredundancy for tandem motifs. We prove that the number of non-overlapping irredundant tandem motifs is O(d(2)n) which, considering d as a constant, leads to a linear number of tandems in the length of the input string. This is an order of magnitude less than previously developed compact indexes for tandem extraction. The notions and bounds provided for tandem motifs are generalized for the case r >= 2, if r is the number of subwords composing the motifs. Finally, we also provide an algorithm to extract irredundant tandem motifs. (C) 2013 Elsevier B.V. All rights reserved.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.3
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据