Article
Computer Science, Artificial Intelligence
Pengyun Zhu, Chaowei Zhang, Xiaofeng Li, Jifu Zhang, Xiao Qin
Summary: Traditional outlier detection methods are not suitable for high-dimensional data analysis due to the curse of dimensionality. Inspired by Coulomb's law, the authors propose a new similarity measure vector for high-dimensional data, which incorporates the outlier Coulomb force and the outlier Coulomb resultant force. The algorithm effectively measures similarities and differences among data objects and provides interpretable results through the Coulomb resultant force. It is evaluated on UCI and synthetic datasets, demonstrating its effectiveness in alleviating the curse of dimensionality and in producing interpretable results for high-dimensional outliers.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Review
Computer Science, Information Systems
Imen Souiden, Mohamed Nazih Omri, Zaki Brahmi
Summary: The rapid evolution of technology has generated high-dimensional data streams in various fields, posing challenges for outlier detection. This study aims to examine existing approaches, identify comparison criteria, and highlight the challenges and research directions associated with this problem.
COMPUTER SCIENCE REVIEW
(2022)
Article
Computer Science, Artificial Intelligence
Yun Yang, ChongJun Fan, Liang Chen, HongLin Xiong
Summary: The study introduces the IPMOD model, which uses information entropy and pruning techniques to improve the accuracy and real-time performance of outlier detection in high-dimensional medical data streams.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Liang Chen, Wei Wang, Yun Yang
Summary: This paper introduces a new algorithm called CELOF for real-time outlier detection on data streams, which effectively overcomes two main limitations of traditional algorithms. Experimental results show that the CELOF algorithm has an average improvement of 15% in accuracy and runs in less than 1% of the time of the original LOF, making it widely applicable in various practical scenarios.
APPLIED SOFT COMPUTING
(2021)
Article
Computer Science, Artificial Intelligence
Abhaya Abhaya, Bidyut Kr Patra
Summary: The paper proposes a hybrid approach named RDPOD, which combines distance-based and density-based clustering methods to correctly identify the density of each point. Experimental results show that the proposed approach outperforms other popular techniques in detecting outlier points.
NEURAL COMPUTING & APPLICATIONS
(2022)
Article
Computer Science, Information Systems
Maria D'Errico, Elena Facco, Alessandro Laio, Alex Rodriguez
Summary: As the capability to generate data increases rapidly, the challenge lies in extracting human-readable and useful information from data sets with high dimensions. Mapping data onto a two or three-dimensional surface is a possible approach to achieve this goal.
INFORMATION SCIENCES
(2021)
Article
Computer Science, Artificial Intelligence
Moritz Herrmann, Florian Pfisterer, Fabian Scheipl
Summary: Outlier or anomaly detection is a crucial task in data analysis. This paper discusses the problem from a geometrical perspective and proposes a framework that utilizes the metric structure of a dataset. The authors leverage the manifold assumption and show that exploiting this structure significantly enhances the detection of outliers in high-dimensional data. They also introduce a novel and precise distinction between distributional and structural outliers based on the geometry and topology of the data manifold. The experiments demonstrate the effectiveness of manifold learning methods in detecting and visualizing outliers in high-dimensional and non-tabular data.
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY
(2023)
Article
Computer Science, Artificial Intelligence
Sayyed Ahmad Naghavi Nozad, Maryam Amir Haeri, Gianluigi Folino
Summary: This paper presents a batch-wise density-based clustering approach for local outlier detection in massive-scale datasets. The method is scalable and processes input data chunk by chunk within a limited memory buffer, gradually updating a temporary clustering model to approximate the structure of the original clusters and assigning an outlying score to each object. The evaluation shows that the proposed method has a low, linear time complexity relative to conventional methods that load all data into memory and to fast distance-based methods that operate on disk-resident data.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Computer Science, Artificial Intelligence
Bahar Ali, Nouman Azam, Anwar Shah, JingTao Yao
Summary: Three-way clustering is effective for handling uncertain, imprecise, and incomplete data, utilizing reduction and elevation operations to create core and support clusters. Experimental results show that RE3WC can detect additional outliers compared to other clustering algorithms, resulting in more compact and precise clusters. Additionally, RE3WC yields comparable results to notable approaches such as LOF, LoOP, ABOD, and IF.
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
(2021)
Article
Computer Science, Information Systems
Cheong Hee Park
Summary: This paper proposes a two-step approach for MPU learning on high-dimensional data. In the first step, negative samples are selected using an ensemble of k-nearest-neighbor-based outlier detection models in a low-dimensional space. In the second step, the linear discriminant function is optimized on the selected positive data and negative samples. Experimental results demonstrate the high performance of the proposed MPU learning method.
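The first step in the summary above, an ensemble of k-nearest-neighbor outlier detectors in a low-dimensional space, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the use of random feature subsets as the "low-dimensional space", and the distance to the k-th neighbour as the score are all assumptions chosen for illustration.

```python
import math
import random


def kth_neighbour_distance(points, k):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def subspace_ensemble_scores(points, k=2, subspace_dim=2, n_models=10, seed=0):
    """Average k-NN outlier scores over random feature subsets.

    Assumption: random feature subsets stand in for whatever
    low-dimensional projection the paper actually uses.
    """
    rng = random.Random(seed)
    n_features = len(points[0])
    totals = [0.0] * len(points)
    for _ in range(n_models):
        # Each ensemble member sees a different low-dimensional view.
        features = rng.sample(range(n_features), subspace_dim)
        projected = [tuple(p[f] for f in features) for p in points]
        for i, score in enumerate(kth_neighbour_distance(projected, k)):
            totals[i] += score
    return [total / n_models for total in totals]
```

Points with the highest averaged score could then be taken as candidate negative samples. That selection rule is likewise an assumption, since the summary does not state it.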
Article
Computer Science, Artificial Intelligence
Jiawei Yang, Susanto Rahardja, Pasi Franti
Summary: The mean-shift outlier detector modifies the data using the mean-shift technique to eliminate the bias caused by outliers and remove their influence, without needing to know in advance which points are outliers. Experimental results show that this method performs well regardless of the number of outliers in the data.
PATTERN RECOGNITION
(2021)
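The mean-shift idea in the summary above can be sketched under stated assumptions. The sketch assumes a k-nearest-neighbour variant of mean shift, in which every point is repeatedly replaced by the mean of its k nearest neighbours, and it scores a point by how far it moved. The function names, the default parameters, and the scoring rule are illustrative and are not taken from the paper.

```python
import math


def knn_mean_shift(points, k=3, iterations=3):
    """Repeatedly replace each point by the mean of its k nearest neighbours.

    The neighbour set includes the point itself (distance 0).
    """
    current = [tuple(p) for p in points]
    for _ in range(iterations):
        shifted = []
        for p in current:
            neighbours = sorted(current, key=lambda q: math.dist(p, q))[:k]
            shifted.append(
                tuple(sum(q[d] for q in neighbours) / k for d in range(len(p)))
            )
        current = shifted
    return current


def shift_scores(points, k=3, iterations=3):
    """Outlier score: distance between a point and its shifted position."""
    shifted = knn_mean_shift(points, k, iterations)
    return [math.dist(p, s) for p, s in zip(points, shifted)]
```

Calling `shift_scores` on a small cluster plus one distant point gives the distant point the largest score, because it is pulled toward the cluster while the cluster members barely move.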
Article
Computer Science, Artificial Intelligence
Guanghua Zhao, Tao Yang, Dongmei Fu
Summary: Manifold learning plays an increasingly important role in machine learning, but its dimensionality reduction effect is weakened by the inevitable noise and outliers that destroy the manifold structure of the data. This paper therefore proposes a denoising algorithm based on manifold learning for high-dimensional data. The algorithm first projects noisy sample vectors onto the local manifold to reduce noise. Statistical analysis of the noise is then performed to obtain a data boundary. Outliers, which are sample vectors outside the data boundary, are marked and eliminated. Finally, dimension reduction is performed on the data after noise reduction and outlier detection. Experimental results show that the algorithm can reduce the interference of noise and outliers with manifold learning on high-dimensional datasets.
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
(2023)
Article
Statistics & Probability
Priyanga Dilini Talagala, Rob J. Hyndman, Kate Smith-Miles
Summary: The article introduces an algorithm for detecting anomalies in high-dimensional data, addressing limitations of the HDoutliers algorithm to improve performance, and demonstrates its wide applicability on various datasets.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2021)
Article
Computer Science, Theory & Methods
Aymen Abid, Salim El Khediri, Abdennaceur Kachouri
Summary: This article compares two density-based clustering methods, DBSCAN and OPTICS, for identifying outliers and normal clusters. The results suggest that the DBSCAN scheme is more accurate and comprehensive for WSNs, while OPTICS remains a suitable solution for hierarchical study of datasets.
Article
Computer Science, Artificial Intelligence
Maximilian B. Toller, Bernhard C. Geiger, Roman Kern
Summary: Rate-distortion theory-based outlier detection utilizes good data compression to encode outliers with unique symbols. We propose Cluster Purging as an extension of clustering-based outlier detection, allowing the assessment of clustering representativity and the identification of data best represented by individual unique clusters. We present two efficient algorithms for Cluster Purging, one parameter-free and the other allowing tuning in supervised setups.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Environmental Sciences
Maria-Viola Martikainen, Paeivi Aakko-Saksa, Lenie van den Broek, Flemming R. Cassee, Roxana O. Carare, Sweelin Chew, Andras Dinnyes, Rosalba Giugno, Katja M. Kanninen, Tarja Malm, Ala Muala, Maiken Nedergaard, Anna Oudin, Pedro Oyola, Tobias V. Pfeiffer, Topi Ronkko, Sanna Saarikoski, Thomas Sandstrom, Roel P. F. Schins, Jan Topinka, Mo Yang, Xiaowen Zeng, Remco H. S. Westerink, Pasi I. Jalava
Summary: The adverse effects of air pollutants on the respiratory and cardiovascular systems are well-known, but recent studies have found that they also have negative effects on the neurological system and cognitive function. Ultrafine particles (UFPs) play a key role in these effects, but there is still limited understanding about the smallest fraction and semivolatile compounds. The TUBE project aims to increase knowledge about harmful UFPs and semivolatile compounds, provide information for better emission legislation, and assess the impact of air pollution on the brain and its removal.
INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH
(2022)
Article
Biochemical Research Methods
Manuel Tognon, Vincenzo Bonnici, Erik Garrison, Rosalba Giugno, Luca Pinello
Summary: GRAFIMO is a command-line tool for scanning known TF DNA motifs in VGs. It extends the standard PWM scanning procedure by considering the variations and alternative haplotypes encoded in a VG, recovering more potential binding sites than scanning only the reference genome.
PLOS COMPUTATIONAL BIOLOGY
(2021)
Article
Cell Biology
Riikka Lampinen, Mohammad Feroze Fazaludeen, Simone Avesani, Tiit Ord, Elina Penttila, Juha-Matti Lehtola, Toni Saari, Sanna Hannonen, Liudmila Saveleva, Emma Kaartinen, Francisco Fernandez Acosta, Marcela Cruz-Haces, Heikki Lopponen, Alan Mackay-Sim, Minna U. Kaikkonen, Anne M. Koivisto, Tarja Malm, Anthony R. White, Rosalba Giugno, Sweelin Chew, Katja M. Kanninen
Summary: This study evaluated the differences in olfactory mucosa between cognitively healthy individuals and Alzheimer's disease patients. The findings showed increased secretion of amyloid-beta in Alzheimer's disease olfactory mucosal cells and identified 240 differentially expressed disease-associated genes and five distinct cell populations. The study also revealed alterations in RNA and protein metabolism, inflammatory processes, and signal transduction in multiple cell populations, suggesting their involvement in Alzheimer's disease-related olfactory mucosa pathophysiology. Additionally, the study proposed alterations in gene expression of mitochondrially located genes in AD OM cells, which were verified by functional assays, demonstrating altered mitochondrial respiration and a reduction of ATP production. The results highlight the changes in olfactory mucosal cells in Alzheimer's disease and demonstrate the significance of single-cell RNA sequencing data in investigating the molecular and cellular mechanisms associated with the disease.
Article
Biochemical Research Methods
Vincenzo Bonnici, Rosalba Giugno
Summary: PANPROVA is a benchmark tool that simulates prokaryotic pangenomic evolution by evolving the complete genomic sequence of an ancestral isolate. It enables operation in the pre-assembly phase and includes evolutionary features such as gene set variations, sequence variations, and horizontal acquisition from a pool of external genomes.
Article
Biochemistry & Molecular Biology
Riikka Lampinen, Veronika Gorova, Simone Avesani, Jeffrey R. Liddell, Elina Penttila, Tana Zavodna, Zdenek Krejcik, Juha-Matti Lehtola, Toni Saari, Juho Kalapudas, Sanna Hannonen, Heikki Lopponen, Jan Topinka, Anne M. Koivisto, Anthony R. White, Rosalba Giugno, Katja M. Kanninen
Summary: The biometal homeostasis in the olfactory mucosa cells of Alzheimer's disease (AD) patients is disturbed and correlated with the alterations in the brain. This provides new clues for the early diagnosis and treatment of AD.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2022)
Article
Oncology
Michele Simbolo, Giovanni Centonze, Luca Giudice, Federica Grillo, Patrick Maisonneuve, Anastasios Gkountakos, Chiara Ciaparrone, Laura Cattaneo, Giovanna Sabella, Rosalba Giugno, Paola Bossi, Paola Spaggiari, Alessandro Del Gobbo, Stefano Ferrero, Luca Mastracci, Alessandra Fabbri, Martina Filugelli, Giovanna Garzone, Natalie Prinzi, Sara Pusceddu, Adele Testi, Valentina Monti, Luigi Rolli, Alessandro Mangogna, Luisa Bercich, Mauro Roberto Benvenuti, Emilio Bria, Sara Pilotto, Alfredo Berruti, Ugo Pastorino, Carlo Capella, Maurizio Infante, Michele Milella, Aldo Scarpa, Massimo Milione
Summary: This study provides an integrated molecular analysis of 44 combined large cell neuroendocrine carcinomas (CoLCNECs), revealing that CoLCNECs are an independent histologic category with specific genomic and transcriptomic features that are different from other lung cancers. The findings of this study contribute to a better understanding of these rare tumors and may lead to the development of new diagnostic approaches for personalized treatments in CoLCNECs.
Article
Biochemistry & Molecular Biology
Maninder Heer, Luca Giudice, Claudia Mengoni, Rosalba Giugno, Daniel Rico
Summary: Researchers have developed a new method called Esearch3D to identify active enhancers using network theory approaches. This method calculates the likelihood of enhancer activity in intergenic regions by analyzing the folding of chromatin in the three-dimensional nuclear space, and regions predicted to have high enhancer activity are shown to be enriched in annotations indicative of enhancer activity.
NUCLEIC ACIDS RESEARCH
(2023)
Article
Multidisciplinary Sciences
Naomi I. Maria, Rosaria Valentina Rapicavoli, Salvatore Alaimo, Evelyne Bischof, Alessia Stasuzzo, Jantine A. C. Broek, Alfredo Pulvirenti, Bud Mishra, Ashley J. Duits, Alfredo Ferro, RxCOVEA Framework
Summary: The current pandemic has created an urgent need for identifying potential drugs for COVID-19. However, our understanding of the host-immune response to SARS-CoV-2 is limited, and there are only a few approved drugs available. To address this, a systems biology tool called PHENotype SIMulator has been introduced. This tool uses transcriptomic and proteomic databases to simulate SARS-CoV-2 infection in host cells, allowing for the identification of viral effects on host-immune response with high sensitivity and specificity (>96%).
Article
Mathematical & Computational Biology
Eva Viesi, Davide Stefano Sardina, Ugo Perricone, Rosalba Giugno
Summary: The World Health Organization estimates that 9 out of 10 people worldwide breathe air containing high levels of pollutants, which can have detrimental effects on vital organs. In order to investigate the link between pollutant exposure and human health effects, the development of an online resource collecting and characterizing pollutant molecules could be beneficial. The APDB database was created to collect air-pollutant-related data from various online resources, including molecules, targets, bioassays, and computed properties. The database provides a web interface for browsing, querying, and visualizing the data.
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION
(2023)
Article
Computer Science, Artificial Intelligence
Luca Gallo, Vito Latora, Alfredo Pulvirenti
Summary: Research on graph representation learning has been highly focused on single-layer graphs, and there is limited research on representation learning of multilayer structures without known inter-layer links. This study proposes MultiplexSAGE, a generalized algorithm capable of embedding multiplex networks and reconstructing intra-layer and inter-layer connectivity. Experimental analysis reveals that the quality of embedding is strongly influenced by the density and randomness of the graph's links in both simple and multiplex networks.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Biology
Simone Avesani, Eva Viesi, Luca Alessandri, Giovanni Motterle, Vincenzo Bonnici, Marco Beccuti, Raffaele Calogero, Rosalba Giugno
Summary: In this study, a new clustering method called Stardust was proposed, which can easily utilize spatial and transcriptomic information to improve clustering analysis. By analyzing ST datasets, the method showed excellent performance in clustering.
Article
Psychiatry
Antonino Petralia, Emanuele Bisso, Ilaria Concas, Antonino Maglitto, Nunzio Bucolo, Salvatore Alaimo, Andrea Di Cataldo, Maria Salvina Signorelli, Alfredo Pulvirenti, Eugenio Aguglia
Summary: This study investigated the relationship between defence styles and predisposition to psychiatric diseases in adults with a history of paediatric cancer, finding that survivors exhibited lower scores in certain defence styles and lower psychopathological indices compared to healthy controls. The results of mediation analysis indicated that specific defence styles had mediation effects on certain psychopathological outcomes, suggesting an indirect relationship between oncological pathology and psychopathology mediated by defence styles such as TAS and TAO. However, other defence styles did not show significant mediation effects on psychopathological outcomes.
GENERAL PSYCHIATRY
(2021)
Article
Biochemistry & Molecular Biology
Nicolas Munz, Luciano Cascione, Luca Parmigiani, Chiara Tarantelli, Andrea Rinaldi, Natasa Cmiljanovic, Vladimir Cmiljanovic, Rosalba Giugno, Francesco Bertoni, Sara Napoli
Summary: Under stressful conditions, cells activate a rescue program modulated by mTOR and rely on microRNAs and lncRNAs for translation regulation. Upregulation of lncRNA lncTNK2-2:1 may be associated with the stabilization of translation and DNA damage regulation in response to treatment with bimiralisib.
Article
Geochemistry & Geophysics
Andrea Cannata, Flavio Cannavo, Giuseppe Di Grazia, Marco Aliotta, Carmelo Cassisi, Raphael S. M. De Plaen, Stefano Gresta, Thomas Lecocq, Placido Montalto, Mariangela Sciotto
Summary: During the COVID-19 pandemic, countries implemented social interventions to restrict human mobility, leading to a decrease in anthropogenic seismic noise. Research found similarities in temporal patterns between the decrease in seismic noise and human mobility.
Proceedings Paper
Biochemical Research Methods
Vincenzo Bonnici, Simone Caligola, Antonino Aparo, Rosalba Giugno
COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS, CIBB 2018
(2020)
Article
Computer Science, Information Systems
Alessio Cecconi, Luca Barbaro, Claudio Di Ciccio, Arik Senderovich
Summary: This paper introduces a framework for designing probabilistic measures for declarative process specifications, which can assess the degree of compliance between process data and specifications. Through experiments, the applicability of the approach for various process mining tasks is demonstrated.
INFORMATION SYSTEMS
(2024)
Article
Computer Science, Information Systems
Mahei Manhai Li, Philipp Reinhard, Christoph Peters, Sarah Oeste-Reiss, Jan Marco Leimeister
Summary: This article introduces a novel human-in-the-loop (HIL) design for ITSM support ticket recommendations by incorporating a value co-creation perspective. The design incentivizes ITSM agents to provide labels during their everyday ticket-handling procedures, and the evaluation shows that recommendations after label improvement have increased user ratings.
INFORMATION SYSTEMS
(2024)
Article
Computer Science, Information Systems
Anton Yeshchenko, Jan Mendling
Summary: This paper presents the development of event sequence data analysis techniques in different fields and proposes an integrated framework to facilitate collaboration and research synergy across various domains.
INFORMATION SYSTEMS
(2024)
Article
Computer Science, Information Systems
Iris Reinhartz-Berger, Alan Hartman, Doron Kliger
Summary: Many IT departments provide solutions that partially meet the needs of business units. This research aims to develop a data-driven analysis method to support the selection of solutions with higher prospects of adoption and identify design gaps and barriers.
INFORMATION SYSTEMS
(2024)
Article
Computer Science, Information Systems
Orlenys Lopez-Pintado, Marlon Dumas, Jonas Berx
Summary: Business process simulation is a versatile technique that predicts the impact of changes on process performance. However, previous approaches have limitations due to their treatment of resources as undifferentiated entities. This article addresses this issue by proposing a new simulation approach that treats each resource as an individual entity with its own performance and availability. The article also presents methods for discovering simulation models with differentiated resources and optimizing resource availability calendars. Empirical evaluation demonstrates that differentiated resource models better replicate cycle time distributions and work rhythm, and iterative optimization of resource allocations and calendars leads to improved cost-time tradeoffs.
INFORMATION SYSTEMS
(2024)