Article
Computer Science, Artificial Intelligence
Giulia Preti, Gianmarco De Francisci Morales, Matteo Riondato
Summary: We propose a sampling-based randomized algorithm called MANIACS for computing high-quality approximations of frequent subgraph patterns in large vertex-labeled graphs. MANIACS provides strong probabilistic guarantees by using the empirical VC dimension and probabilistic tail bounds. It leverages the MNI frequency properties to aggressively prune the pattern search space, resulting in faster exploration of subspaces without frequent patterns. Experimental evaluation shows that MANIACS returns high-quality collections of frequent patterns in large graphs up to two orders of magnitude faster than the exact algorithm.
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY
(2023)
Article
Computer Science, Information Systems
Shafiul Alom Ahmed, Bhabesh Nath
Summary: The paper introduces an approach to pattern mining called Improved Frequent Pattern Growth, which constructs an Improved FP-tree data structure and introduces a layout of Conditional FP-tree for efficient generation of frequent patterns. The experimental results highlight the significance of the proposed Improved FP-Growth algorithm over traditional frequent itemset mining algorithms.
INFORMATION SCIENCES
(2021)
Article
Computer Science, Information Systems
Xiaojie Zhang, Yanlin Qi, Guoting Chen, Wensheng Gan, Philippe Fournier-Viger
Summary: Frequent pattern mining has a wide range of applications. This paper proposes two algorithms, for mining fuzzy-driven periodic frequent patterns and for finding stable fuzzy-driven periodic frequent patterns. Fuzzy sets, novel pruning strategies, and an estimated period co-occurrence structure improve the efficiency of the mining process.
INFORMATION SCIENCES
(2022)
Article
Computer Science, Artificial Intelligence
Yaling Xun, Xiaohui Cui, Jifu Zhang, Qingxia Yin
Summary: The article introduces an incremental frequent itemsets mining algorithm based on multi-scale theory called FPMSIM, which constructs a pattern tree using the classic FP-Growth to improve mining efficiency and reduce I/O costs.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Computer Science, Artificial Intelligence
Md Ashraful Islam, Mahfuzur Rahman Rafi, Al-amin Azad, Jesan Ahammed Ovi
Summary: Data mining is the study of extracting useful information from massive amounts of data, and sequential pattern mining is one of its major branches. Weighted sequential pattern mining is better suited to today's datasets because items differ in importance in real-life scenarios. This research introduces a new pruning technique and framework that generate a small number of candidate sequences faster without compromising completeness, significantly outperforming existing approaches.
APPLIED INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Razieh Davashi
Summary: This paper proposes a fast method called ITUFP for interactive mining of Top-K uncertain frequent patterns (UFPs). The method efficiently stores and extracts pattern information by creating UP-Lists and IMCUP-Lists, and only updates the IMCUP-Lists when the K value changes. Experimental results demonstrate that the proposed method is very efficient for this task.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Biochemistry & Molecular Biology
Jaime A. Castro-Mondragon, Miriam Ragle Aure, Ole Christian Lingjaerde, Anita Langerod, John W. M. Martens, Anne-Lise Borresen-Dale, Vessela N. Kristensen, Anthony Mathelier
Summary: Most cancer alterations occur in the noncoding portion of the genome, where regulatory regions control gene expression. This study shows that transcription factor binding sites (TFBSs) have similar mutation loads to protein-coding exons. By analyzing cancer somatic mutations in TFBSs and gene expression data, the combined effects of transcriptional and post-transcriptional alterations on regulatory programs in cancer can be evaluated.
NUCLEIC ACIDS RESEARCH
(2022)
Article
Automation & Control Systems
Razieh Davashi
Summary: This study proposes an efficient method based on an upper bound approach to mine uncertain frequent patterns, reducing false positives significantly by tightening the upper bound of expected support and early pruning of infrequent 2-itemsets and their supersets.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2021)
Article
Automation & Control Systems
Laszlo Bantay, Janos Abonyi
Summary: This article proposes a method based on frequent pattern mining for log file partitioning to explore parallel processes. By identifying event groups and overlapping sub-processes, more compact and interpretable process models can be obtained. The method has been validated in the analysis of process alarms in an industrial plant, and it is recommended to be applied in cases where there is no clear start and end of the logged events.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Article
Chemistry, Analytical
Shiting Ding, Zhiheng Li, Kai Zhang, Feng Mao
Summary: This study selects representative sequential pattern mining algorithms and evaluates their performance on taxi trajectory data. The results show that contiguous constraint-based algorithms perform well, balancing RAM consumption and execution time.
Article
Computer Science, Information Systems
Penugonda Ravikumar, Palla Likhitha, Bathala Venus Vikranth Raj, Rage Uday Kiran, Yutaka Watanobe, Koji Zettsu
Summary: Discovering periodic-frequent patterns in temporal databases is challenging, and most existing algorithms rely on a horizontal database layout, which leads to inefficiencies. A vertical database layout is important because real-world big data is often stored this way. The proposed PF-ECLAT algorithm demonstrates memory and runtime efficiency, scalability, and usefulness in case studies analyzing air pollution and traffic congestion.
Article
Computer Science, Artificial Intelligence
Natalia Mordvanyuk, Beatriz Lopez, Albert Bifet
Summary: This study introduces a new algorithm, vertTIRP, for mining Time-Interval-Related Patterns (TIRP). It efficiently manages patterns using temporal transitivity, sorts temporal relations to speed up mining, and eliminates ambiguities in temporal relations. Experimental evaluation shows that vertTIRP requires significantly less computation time than other algorithms and is an effective approach.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Computer Science, Information Systems
Palla Likhitha, Penugonda Ravikumar, Deepika Saxena, Rage Uday Kiran, Yutaka Watanobe
Summary: Finding periodic-frequent patterns in temporal databases is a significant data mining problem. This paper proposes a solution to discover the top-k periodic-frequent patterns in a database.
Article
Computer Science, Artificial Intelligence
Ham Nguyen, Nguyen Le, Huong Bui, Tuong Le
Summary: Mining patterns that satisfy both frequency and utility constraints is an important problem in data mining. There are currently two main approaches: one considers these factors separately, and the other combines them using a composite measure. This study proposes a new structure and algorithm to effectively mine frequent weighted utility patterns in quantitative databases, and the experimental results demonstrate its superiority over existing methods.
APPLIED INTELLIGENCE
(2023)
Article
Chemistry, Multidisciplinary
Miguel Nunez-del-Prado, Yoshitomi Maehara-Aliaga, Julian Salas, Hugo Alatrista-Salas, David Megias
Summary: This paper proposes a differential privacy graph-based technique for publishing frequent sequential patterns, which can protect these patterns without accessing all users' sequences. The utility of this technique as a pattern mining algorithm is assessed, along with its impact on a recommender system. A comparison with the DP-FSM algorithm is also performed.
APPLIED SCIENCES-BASEL
(2022)
Editorial Material
Biochemical Research Methods
Susanne Hollmann, Marcus Frohme, Christoph Endrullat, Andreas Kremer, Domenica D'Elia, Babette Regierer, Alina Nechyporenko
PLOS COMPUTATIONAL BIOLOGY
(2020)
Article
Microbiology
Isabel Moreno-Indias, Leo Lahti, Miroslava Nedyalkova, Ilze Elbere, Gennady Roshchupkin, Muhamed Adilovic, Onder Aydemir, Burcu Bakir-Gungor, Enrique Carrillo-de Santa Pau, Domenica D'Elia, Mahesh S. Desai, Laurent Falquet, Aycan Gundogdu, Karel Hron, Thomas Klammsteiner, Marta B. Lopes, Laura Judith Marcos-Zambrano, Claudia Marques, Michael Mason, Patrick May, Lejla Pasic, Gianvito Pio, Sandor Pongor, Vasilis J. Promponas, Piotr Przymus, Julio Saez-Rodriguez, Alexia Sampri, Rajesh Shigdel, Blaz Stres, Ramona Suharoschi, Jaak Truu, Ciprian-Octavian Truica, Baiba Vilne, Dimitrios Vlachakis, Ercument Yilmaz, Georg Zeller, Aldert L. Zomer, David Gomez-Cabrero, Marcus J. Claesson
Summary: The study of the human microbiome presents challenges in dealing with the heterogeneity of data and the variation in microbiome composition. New techniques are required to address emerging applications and the vast heterogeneity of microbiome data.
FRONTIERS IN MICROBIOLOGY
(2021)
Article
Computer Science, Artificial Intelligence
Annalisa Appice, Angelo Cannarile, Antonella Falini, Donato Malerba, Francesca Mazzia, Cristiano Tamborrino
Summary: Saliency detection in hyperspectral imaging, which mimics the natural visual attention mechanism, still has room for improvement despite existing models. An ensemble learning methodology that leverages spectral information from multiple images shows promising results in enhancing saliency detection performance.
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
(2021)
Article
Computer Science, Information Systems
Vincenzo Pasquadibisceglie, Annalisa Appice, Giovanna Castellano, Donato Malerba
Summary: Predictive business process monitoring is an online approach that predicts the unfolding of running traces based on historical event logs. This article proposes a novel predictive process method that combines multi-view learning and deep learning to improve predictive accuracy by considering various information recorded in event logs.
IEEE TRANSACTIONS ON SERVICES COMPUTING
(2022)
Article
Computer Science, Artificial Intelligence
Giuseppina Andresini, Annalisa Appice, Dino Ienco, Donato Malerba
Summary: This paper proposes a method called SENECA, which is based on a change detection (CD) Siamese network and uses transfer learning and active learning to learn an accurate CD model from limited labelled data. The experimental results demonstrate the significant benefits of the proposed method in improving the accuracy of CD decisions.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Computer Science, Artificial Intelligence
Giuseppina Andresini, Annalisa Appice, Francesco Paolo Caforio, Donato Malerba, Gennaro Vessio
Summary: Network Intrusion Detection (NID) systems are crucial for network protection, but existing deep learning methods are too complex to interpret. This paper proposes a new neural model called ROULETTE, which combines an attention mechanism and a multi-output deep learning strategy for accurate and explainable classification of network traffic data. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed method in terms of accuracy and explainability.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Article
Automation & Control Systems
Vincenzo Pasquadibisceglie, Annalisa Appice, Giovanna Castellano, Donato Malerba
Summary: Predictive process monitoring (PPM) is a task in Process Mining that aims to predict factors of a business process based on historical event logs. However, existing PPM algorithms assume steady-state processes, which is not the case in the real world due to concept drifts. This work proposes DARWIN, a PPM method that detects and adapts to concept drifts in business data streams, and provides empirical analysis and experiments to showcase its effectiveness.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Review
Microbiology
Eliana Ibrahimi, Marta B. Lopes, Xhilda Dhamo, Andrea Simeon, Rajesh Shigdel, Karel Hron, Blaz Stres, Domenica D'Elia, Magali Berland, Laura Judith Marcos-Zambrano
Summary: This mini review examines the preprocessing and transformation methods used in recent human microbiome studies, highlighting the limited adoption of statistical transformation methods specifically targeting microbiome sequencing data characteristics. Instead, relative and normalization-based transformations are commonly used without considering the specific attributes of microbiome data. The lack of information on preprocessing and transformations applied to the data raises concerns about reproducibility, comparability, and reliability of results.
FRONTIERS IN MICROBIOLOGY
(2023)
Article
Microbiology
Domenica D'Elia, Jaak Truu, Leo Lahti, Magali Berland, Georgios Papoutsoglou, Michelangelo Ceci, Aldert Zomer, Marta B. Lopes, Eliana Ibrahimi, Aleksandra Gruca, Alina Nechyporenko, Marcus Frohme, Thomas Klammsteiner, Enrique Carrillo-de Santa Pau, Laura Judith Marcos-Zambrano, Karel Hron, Gianvito Pio, Andrea Simeon, Ramona Suharoschi, Isabel Moreno-Indias, Andriy Temko, Miroslava Nedyalkova, Elena-Simona Apostol, Ciprian-Octavian Truica, Rajesh Shigdel, Jasminka Hasic Telalovic, Erik Bongcam-Rudloff, Piotr Przymus, Naida Babic Jordamovic, Laurent Falquet, Sonia Tarazona, Alexia Sampri, Gaetano Isola, David Perez-Serrano, Vladimir Trajkovik, Lubos Klucar, Tatjana Loncar-Turukalo, Aki S. Havulinna, Christian Jansen, Randi J. Bertelsen, Marcus Joakim Claesson
Summary: The rapid development of machine learning techniques has opened up new applications in the field of microbiome research, improving healthcare practices in the era of precision medicine. ML4Microbiome is a European network that aims to promote collaboration between microbiome researchers and ML experts to optimize and standardize ML approaches for microbiome analysis.
FRONTIERS IN MICROBIOLOGY
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Paolo Mignone, Donato Malerba, Michelangelo Ceci
Summary: In this paper, anomaly detection was performed for air pollution and public transport traffic analysis in Oslo, Norway using the SparkGHSOM method. The detected anomalies were explained through an instance-based feature ranking approach. The results showed successful anomaly detection for both applications.
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Malik AL-Essa, Giuseppina Andresini, Annalisa Appice, Donato Malerba
Summary: This paper explores the effectiveness of adversarial training in cybersecurity and uses an XAI technique to analyze the impact of specific input features on decision-making, giving security analysts better insight into feature robustness. It also investigates the use of XAI for robust feature selection in cybersecurity problems.
FOUNDATIONS OF INTELLIGENT SYSTEMS (ISMIS 2022)
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Giuseppina Andresini, Annalisa Appice, Domenico Dell'Olio, Donato Malerba
Summary: This study utilizes machine learning to monitor land cover changes using Sentinel-2 images. The proposed method involves a Siamese network and transfer learning, with unsupervised estimation of change pseudo-labels in new scenes. The results demonstrate the effectiveness of the approach in detecting land cover changes in Sentinel-2 images acquired at different times.
AIXIA 2021 - ADVANCES IN ARTIFICIAL INTELLIGENCE
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Giuseppina Andresini, Annalisa Appice, Corrado Loglisci, Vincenzo Belvedere, Domenico Redavid, Donato Malerba
Summary: Researchers have proposed a method to combat concept drift in network traffic data by updating deep neural network architectures to fit drifted data, leading to higher predictive accuracy in intrusion detection.
DISCOVERY SCIENCE (DS 2021)
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Francesco Paolo Caforio, Giuseppina Andresini, Gennaro Vessio, Annalisa Appice, Donato Malerba
Summary: This paper proposes a method to make the visual explanations of deep learning-based intrusion detection models more transparent and accurate, addressing issues related to network cyber attacks. The method demonstrates effectiveness on a CNN trained on a 2D representation of historical network traffic data.
DISCOVERY SCIENCE (DS 2021)
(2021)