4.7 Article

Frequent pattern discovery with tri-partition alphabets

Journal

INFORMATION SCIENCES
Volume 507, Pages 715-732

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2018.04.013

Keywords

Pattern discovery; Sequence; Tri-partition; Tri-pattern

Funding

  1. National Science Foundation of China [61379089, 41604114]
  2. Open Research Fund of Sichuan Key Laboratory for Nature Gas and Geology [2015trqdz04]

The concept of patterns is the basis of sequence analysis. There are various pattern definitions for biological data, texts, and time series. Inspired by the methodology of three-way decisions and protein tri-partition, this paper proposes a frequent pattern discovery algorithm for a new type of pattern by dividing the alphabet into strong, medium, and weak parts. The new type, called a tri-pattern, is more general and flexible than existing ones and is therefore more interesting in applications. Experiments were undertaken on data in various fields to reveal the universality of this new pattern. These include protein sequence mining, petroleum production time series analysis, and forged Chinese text keyword mining. The results show that tri-patterns are more meaningful and desirable than the existing four types of patterns. This study enriches the semantics of sequential pattern discovery and the application fields of three-way decisions. © 2018 Elsevier Inc. All rights reserved.
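
The abstract does not state how the three parts of the alphabet behave during matching, so the following Python sketch is only an illustration under assumed rules, not the paper's algorithm. The assumptions are: the partition is supplied by the caller, weak characters are skipped as gaps, medium characters are matched like ordinary characters, and a pattern must begin and end with a strong character. The function name `frequent_tri_patterns` and its parameters are hypothetical.

```python
from collections import Counter


def frequent_tri_patterns(sequence, strong, medium, weak, length, min_support):
    """Count fixed-length patterns under an ASSUMED tri-partition rule.

    Assumptions (not taken from the paper):
      - weak characters are gaps and are skipped,
      - medium characters are matched like ordinary characters,
      - a pattern must start and end with a strong character.
    """
    # The three parts must be disjoint and must cover the sequence's alphabet.
    if strong & medium or strong & weak or medium & weak:
        raise ValueError("strong, medium, and weak parts must be disjoint")
    unknown = set(sequence) - (strong | medium | weak)
    if unknown:
        raise ValueError(f"characters not assigned to any part: {sorted(unknown)}")

    # Assumed rule: drop weak characters before sliding a window.
    kept = [ch for ch in sequence if ch not in weak]

    counts = Counter()
    for i in range(len(kept) - length + 1):
        window = kept[i:i + length]
        # Assumed rule: only windows anchored by strong characters count.
        if window[0] in strong and window[-1] in strong:
            counts["".join(window)] += 1

    # Keep patterns whose occurrence count reaches the support threshold.
    return {p: c for p, c in counts.items() if c >= min_support}


if __name__ == "__main__":
    result = frequent_tri_patterns(
        "AmBxAmB", strong={"A", "B"}, medium={"m"}, weak={"x"},
        length=3, min_support=2,
    )
    print(result)
```

Under these assumed rules, the weak character `x` in the usage example does not break up the two occurrences of the pattern around it; a different matching rule would change which patterns are reported.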
