Article; Proceedings Paper

Mining conjunctive sequential patterns

Journal

DATA MINING AND KNOWLEDGE DISCOVERY
Volume 17, Issue 1, Pages 77-93

Publisher

SPRINGER
DOI: 10.1007/s10618-008-0108-z

Keywords

sequential patterns; condensed representation; deduction; non-derivability

In this paper we aim to extend the non-derivable condensed representation from frequent itemset mining to sequential pattern mining. We start by showing a negative example: in the context of frequent sequences, the notion of non-derivability is meaningless. Therefore, we extend our focus to the mining of conjunctions of sequences. Besides being of practical importance, this class of patterns has some nice theoretical properties. Based on a new, unexploited theoretical definition of equivalence classes for sequential patterns, we are able to extend the notion of a non-derivable itemset to the sequence domain. We present a new depth-first approach to mine non-derivable conjunctive sequential patterns and show its use in mining association rules for sequences. This approach is based on a well-known combinatorial theorem: the Möbius inversion. A performance study using both synthetic and real datasets illustrates the efficiency of our mining algorithm. These newly introduced patterns have high potential for real-life applications, especially in network monitoring and biomedical fields, since they make it possible to obtain sequential association rules with all the classical statistical metrics, such as confidence, conviction, and lift.
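
The rule metrics named at the end of the abstract can be illustrated with a small sketch. It uses the standard textbook definitions of confidence, lift, and conviction, which may differ in detail from the formulation in the paper. The function name `rule_metrics` and its parameters are hypothetical; the counts are assumed to be the number of database sequences that contain the antecedent, the consequent, and both.

```python
from math import inf


def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Standard metrics for a rule A -> C, computed from support counts.

    Hypothetical helper, not taken from the paper:
      n_total      -- number of sequences in the database
      n_antecedent -- sequences containing the antecedent A
      n_consequent -- sequences containing the consequent C
      n_both       -- sequences containing both A and C
    """
    if n_total <= 0 or n_antecedent <= 0 or n_consequent <= 0:
        raise ValueError("counts must be positive")

    support = n_both / n_total              # P(A and C)
    confidence = n_both / n_antecedent      # P(C | A)
    supp_c = n_consequent / n_total         # P(C)
    lift = confidence / supp_c              # > 1 means positive correlation

    # Conviction is unbounded when the rule never fails.
    if confidence == 1.0:
        conviction = inf
    else:
        conviction = (1 - supp_c) / (1 - confidence)

    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(rule_metrics(n_total=100, n_antecedent=40, n_consequent=50, n_both=30))
```

All three metrics need the support of the antecedent, the consequent, and their conjunction, which is why a representation that gives the supports of conjunctions of sequences is enough to compute them.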

Recommended

Article Computer Science, Artificial Intelligence

Effective and efficient location influence mining in location-based social networks

Muhammad Aamir Saleem, Rohit Kumar, Toon Calders, Torben Bach Pedersen

KNOWLEDGE AND INFORMATION SYSTEMS (2019)

Article Management

PROMETHEE is not quadratic: An O(qn log(n)) algorithm

Toon Calders, Dimitri Van Assche

OMEGA-INTERNATIONAL JOURNAL OF MANAGEMENT SCIENCE (2018)

Article Computer Science, Artificial Intelligence

A novel hierarchical-based framework for upper bound computation of graph edit distance

Karam Gouda, Mona Arafa, Toon Calders

PATTERN RECOGNITION (2018)

Editorial Material Computer Science, Artificial Intelligence

Introduction to the special issue on discovery science

Michelangelo Ceci, Toon Calders

MACHINE LEARNING (2018)

Article Computer Science, Information Systems

Keeping the Data Lake in Form: Proximity Mining for Pre-Filtering Schema Matching

Ayman Alserafi, Alberto Abello, Oscar Romero, Toon Calders

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2020)

Article Computer Science, Information Systems

Distributed mining of convoys in large scale datasets

Faisal Orakzai, Torben Bach Pedersen, Toon Calders

Summary: The widespread use of mobile devices has generated a large amount of movement data which is being mined to understand collective mobility behaviors of humans, animals, and objects. Convoy pattern is a useful pattern for finding groups moving together. The DCM algorithm proposed in this paper is a scalable and efficient distributed convoy pattern mining algorithm that outperforms existing algorithms.

GEOINFORMATICA (2021)

Article Computer Science, Interdisciplinary Applications

Roman Urdu toxic comment classification

Hafiz Hassaan Saeed, Muhammad Haseeb Ashraf, Faisal Kamiran, Asim Karim, Toon Calders

Summary: This paper addresses the challenge of Roman Urdu toxic comment detection by developing a first-ever large labeled corpus of toxic and non-toxic comments. With the ensemble approach, the best F1-score reaches 86.35%, setting the first-ever benchmark for toxic comment classification in Roman Urdu.

LANGUAGE RESOURCES AND EVALUATION (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Finding Geo-Social Cohorts in Location-Based Social Networks

Muhammad Aamir Saleem, Toon Calders, Torben Bach Pedersen, Panagiotis Karras

Summary: This paper explores how to identify and predict groups of companions through social ties and geo-tagged activity information. The proposed nontrivial algorithm COVER uses an activity-driven pruning criterion to guide its exploration and is shown to outperform brute-force baselines and previous work in terms of efficiency and prediction accuracy regarding groups of companions.

WEB AND BIG DATA, APWEB-WAIM 2021, PT II (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Learning a Fair Distance Function for Situation Testing

Daphne Lenders, Toon Calders

Summary: Situation testing is a method used to prove discrimination by comparing the treatment of similar individuals in the same situation, with the key being finding a suitable distance function to define similarity in the dataset. Recent data-driven equivalents have been proposed, but the challenge lies in determining the appropriate distance function that disregards irrelevant attributes and weighs relevant attributes for classification.

MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021, PT I (2021)

Proceedings Paper Business

Incremental Predictive Process Monitoring: The Next Activity Case

Stephen Pauwels, Toon Calders

Summary: Incremental learning strategies are proposed for updating next-activity prediction models for business processes without the need for full retraining, reducing computational resources while maintaining a more consistent and accurate view of the process.

BUSINESS PROCESS MANAGEMENT (BPM 2021) (2021)

Article Computer Science, Information Systems

k/2-hop: Fast Mining of Convoy Patterns With Effective Pruning

Faisal Orakzai, Toon Calders, Torben Bach Pedersen

PROCEEDINGS OF THE VLDB ENDOWMENT (2019)

Article Computer Science, Information Systems

Detecting Anomalies in Hybrid Business Process Logs

Stephen Pauwels, Toon Calders

APPLIED COMPUTING REVIEW (2019)

Proceedings Paper Computer Science, Interdisciplinary Applications

An Anomaly Detection Technique for Business Processes based on Extended Dynamic Bayesian Networks

Stephen Pauwels, Toon Calders

SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING (2019)

Proceedings Paper Computer Science, Information Systems

Cost Model for Pregel on GraphX

Rohit Kumar, Alberto Abello, Toon Calders

ADVANCES IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2017 (2017)

Proceedings Paper Computer Science, Artificial Intelligence

DS-Prox: Dataset Proximity Mining for Governing the Data Lake

Ayman Alserafi, Toon Calders, Alberto Abello, Oscar Romero

SIMILARITY SEARCH AND APPLICATIONS, SISAP 2017 (2017)
