Journal
IEEE ACCESS
Volume 5, Issue -, Pages 3111-3120
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2017.2676803
Keywords
Data mining; pattern mining; erasable pattern; erasable closed pattern
Funding
- Foundation for Science and Technology Development of Ton Duc Thang University (FOSTECT) [FOSTECT.2015.BR.01]
Finding knowledge in large data sets for use in intelligent systems has become increasingly important in the Internet era. Pattern mining, classification, text mining, and opinion mining are all topical issues, and pattern mining is among the most important of them. The problem of mining erasable patterns (EPs) was proposed as a variant of frequent pattern mining for optimizing the production plans of factories. Several algorithms have been proposed for mining EPs effectively. However, for large threshold values, many EPs are obtained, which leads to large memory usage. It is therefore necessary to mine a condensed representation of EPs. This paper first defines erasable closed patterns (ECPs), which can represent the set of EPs without information loss. Then, a theorem for quickly determining ECPs based on the dPidset structure is proposed and proven. Next, two efficient algorithms for mining ECPs based on this theorem are proposed: erasable closed pattern mining (ECPat) and a dNC_Set-based algorithm for erasable closed pattern mining (dNC-ECPM). Experimental results show that ECPat is the best method for sparse data sets, while dNC-ECPM outperforms both ECPat and a modified mining erasable itemsets algorithm in terms of mining time and memory usage on all remaining data sets.
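To make the notions of EP and ECP concrete, the following is a minimal brute-force sketch in Python. It is not the paper's ECPat or dNC-ECPM algorithm and does not use the dPidset structure. It assumes the formulation commonly used in the erasable-itemset literature: each product has a set of component items and a profit, the gain of an itemset is the total profit of the products that use at least one of its items, and an itemset is erasable when its gain does not exceed a threshold fraction of the total profit. An EP is treated as closed when no proper superset has the same gain. The toy database and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy product database: (component items, profit).
PRODUCTS = [
    ({"a", "b", "f"}, 100),
    ({"a", "c"}, 50),
    ({"b", "d"}, 200),
    ({"c", "d"}, 150),
    ({"e"}, 500),
]


def gain(itemset, products):
    """Total profit of the products that use at least one item of `itemset`."""
    return sum(profit for items, profit in products if items & itemset)


def erasable_patterns(products, threshold):
    """Enumerate every itemset whose gain is at most threshold * total profit.

    Exhaustive enumeration, exponential in the number of items; only
    suitable for illustrating the definition on tiny inputs.
    """
    limit = threshold * sum(profit for _, profit in products)
    all_items = sorted(set().union(*(items for items, _ in products)))
    found = {}
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            candidate = frozenset(combo)
            g = gain(candidate, products)
            if g <= limit:
                found[candidate] = g
    return found


def closed_patterns(eps):
    """Keep the EPs that have no proper superset with the same gain.

    Gain never decreases when items are added, so any superset with equal
    gain is itself erasable and therefore already present in `eps`.
    """
    return {
        x: g
        for x, g in eps.items()
        if not any(x < y and g == gy for y, gy in eps.items())
    }


def recover_gain(itemset, closed):
    """Gain of an EP, recovered as the smallest gain among its closed supersets."""
    return min(g for y, g in closed.items() if itemset <= y)


if __name__ == "__main__":
    eps = erasable_patterns(PRODUCTS, threshold=0.32)
    closed = closed_patterns(eps)
    print(len(eps), "EPs,", len(closed), "ECPs")
    for pattern, g in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(pattern), g)
```

On this toy input the closed patterns are fewer than the erasable ones, and `recover_gain` reproduces the gain of every EP from the closed set alone, which is one way to read "without information loss" under the assumptions above.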