Article
Computer Science, Artificial Intelligence
Till Hendrik Schulz, Tamas Horvath, Pascal Welke, Stefan Wrobel
Summary: Weisfeiler-Lehman graph kernels are still one of the most prevalent graph kernels after more than a decade, thanks to their impressive predictive performance and time complexity. However, their binary comparison based on label equality may be too rigid for certain graph classes. To address this limitation, we propose a generalization of the Weisfeiler-Lehman graph kernels that considers a more natural and fine-grained similarity between labels. We demonstrate that this similarity can be efficiently calculated using the Wasserstein distance between vectors representing the labels. Our generalization outperforms other state-of-the-art graph kernels in terms of predictive performance on datasets with structurally complex graphs.
Article
Computer Science, Artificial Intelligence
Samuel Genheden, Ola Engkvist, Esben Bjerrum
Summary: This article expands on recent research on clustering synthetic routes and trains a deep learning model to predict distances between different routes. The machine learning approach used in this study is considerably faster than the traditional tree edit distance method and allows for clustering a greater number of routes with similar results. The developed model is also open-source.
MACHINE LEARNING-SCIENCE AND TECHNOLOGY
(2022)
Article
Computer Science, Artificial Intelligence
R. Rueda, M. P. Cuellar, L. G. B. Ruiz, M. C. Pegalajar
Summary: Finding a balance between diversity and convergence is crucial in evolutionary algorithms, especially for solving symbolic regression problems. This paper proposes a similarity measure based on edit distance and combines it with the CHC algorithm strategy to control diversity in the population, thus avoiding local optima.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Article
Computer Science, Information Systems
Liyu Huang, Qingfeng Chen, Yongjie Li, Cheng Luo
Summary: In this paper, a novel RNA secondary structure model called modified adjoining grammars binary tree (BTMGcsp) is proposed to intuitively represent complex RNA secondary structures while preserving structural properties. The method substantially reduces memory and time consumption, with experimental results showing a high AUC value of 0.949 in PseudoBase.
Article
Computer Science, Software Engineering
Raghavendra Sridharamurthy, Vijay Natarajan
Summary: Comparative analysis of scalar fields is important for various applications such as feature-directed visualization and feature tracking in time-varying data. Comparing topological structures of scalar fields provides faster and more meaningful comparisons. While global measures exist for comparing topological structures, there is a lack of measures for local comparison. This study presents a local variant of the tree edit distance to enable fine-grained analysis of merge trees, with experimental results demonstrating its utility in different applications.
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS
(2023)
Article
Computer Science, Information Systems
David B. Blumenthal, Sebastien Bougleux, Anton Dignos, Johann Gamper
Summary: This paper introduces a method for computing K dissimilar minimum cost bipartite matchings and proves that the problem is NP-hard. It presents heuristics based on greedy dynamic programming and shows that these techniques outperform existing algorithms in terms of dissimilarity of the obtained matchings and improve the upper bounds of state-of-the-art algorithms for graph edit distance computation based on bipartite data matching.
INFORMATION SCIENCES
(2022)
Article
Computer Science, Information Systems
Mohammed Hadwan
Summary: Nowadays, people refer to celebrities and experts not only by their real names but also by their aliases on the web. This research proposes a reliable algorithm to detect aliases resulting from the transliteration of Arabic names into English, with improvements in calculating substitution and transposition costs. Testing shows that this algorithm outperforms others in achieving a better average percentage of similarity.
Article
Mathematics, Applied
Eliska Sestakova, Ondrej Guth, Jan Janousek
Summary: Given an input tree and a tree pattern, the inexact tree pattern matching problem is to find all subtrees in the input tree that match the tree pattern with up to k errors. The proposed solution is based on a finite automaton that reads the input tree represented in linear, prefix bar notation. The deterministic version of the finite automaton finds all inexact occurrences of the tree pattern in time linear in the size of the input tree.
DISCRETE APPLIED MATHEMATICS
(2023)
Article
Computer Science, Information Systems
Dan Zhu, Hui Zhu, Xiangyu Wang, Rongxing Lu, Dengguo Feng
Summary: In the past decade, genomic data has exponentially grown and is widely used in medical and health-related applications, providing new opportunities for the field of medicine. Similar patients query (SPQ) is a popular application that helps physicians formulate optimal therapies. However, ensuring privacy protection becomes crucial for the success of SPQ services due to the sensitive nature of human genomes.
IEEE TRANSACTIONS ON CLOUD COMPUTING
(2023)
Article
Urban Studies
Jianxin Yang, Shengbing Yang, Jingjing Li, Jian Gong, Man Yuan, Jingye Li, Yunzhe Dai, Jing Ye
Summary: This study presents a distance-driven urban simulation model that regulates urban morphology to simulate expansion, bypassing the need for extensive data and skills required in traditional urban expansion models. The model was successfully applied in Wuhan, China, providing a distance-driven framework for exploring urban dynamics when data and skills on urban expansion mechanisms are limited.
Article
Computer Science, Information Systems
Tao Qiu, Chuanyu Zong, Xiaochun Yang, Bin Wang, Bing Li
Summary: This paper addresses the problem of similar substring matching with edit distance constraints. Existing methods have an issue with balancing the cost of filtering and verification, so a new hierarchical filtering paradigm is proposed to address this issue. The cost is further reduced by eliminating duplicate filters.
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS
(2023)
Article
Computer Science, Hardware & Architecture
Yandong Zheng, Rongxing Lu, Yunguo Guan, Jun Shao, Hui Zhu
Summary: Similarity query over time series data is important in various applications. Existing solutions still have issues in supporting queries with different lengths, and have limitations in query accuracy and efficiency. In this article, we propose a new efficient and privacy-preserving similarity range query scheme using the time warp edit distance (TWED) as the similarity metric. Our scheme leverages a kd-tree and symmetric homomorphic encryption technique to improve query efficiency and protect data privacy.
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING
(2022)
Article
Chemistry, Multidisciplinary
Kai Frerich, Mark Bukowski, Sandra Geisler, Robert Farkas
Summary: This study combines taxonomic and textual information to develop an ensemble classification system for patent categorization, achieving nearly 10 points higher performance when compared to basic classifiers. The classifiers are trained on patents' title/abstract and CPC, IPC assignments, with the taxonomies transformed into real-valued vectors through DSE. The ensemble of classifiers, particularly when combined with a feed-forward ANN, outperforms individual classifiers and offers new possibilities for technology management.
APPLIED SCIENCES-BASEL
(2021)
Article
Computer Science, Artificial Intelligence
Hammad Majeed, Abdul Wali, Mirza Beg
Summary: The study proposes a technique based on partial derivatives to evaluate the impact of a subtree on the output of a GP tree and reduces semantic errors by defining an impact-aware crossover operator. Through comparison, the proposed technique demonstrates higher efficiency and reliability across all tested problems.
SWARM AND EVOLUTIONARY COMPUTATION
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Peter Bednar
Summary: This study applied edit distance functions to measure morphological ambiguity and similarity between natural languages by comparing both morphological and syntactical annotations of words in sentences. Experiments conducted with Slovak as the reference language and a set of Slavic languages showed the effectiveness of this method within the Universal Dependencies framework.
2021 IEEE 19TH WORLD SYMPOSIUM ON APPLIED MACHINE INTELLIGENCE AND INFORMATICS (SAMI 2021)
(2021)
Review
Biochemical Research Methods
Vladimir Gligorijevic, Noel Malod-Dognin, Natasa Przulj
Editorial Material
Multidisciplinary Sciences
Natasa Przulj, Noel Malod-Dognin
Article
Biochemical Research Methods
Thomas Gaudelet, Noel Malod-Dognin, Natasa Przulj
Article
Multidisciplinary Sciences
Noel Malod-Dognin, Kristina Ban, Natasa Przulj
SCIENTIFIC REPORTS
(2017)
Article
Multidisciplinary Sciences
Noel Malod-Dognin, Julia Petschnigg, Sam F. L. Windels, Janez Povh, Harry Hemmingway, Robin Ketteler, Natasa Przulj
NATURE COMMUNICATIONS
(2019)
Article
Biochemical Research Methods
Sam F. L. Windels, Noel Malod-Dognin, Natasa Przulj
Correction
Multidisciplinary Sciences
Noel Malod-Dognin, Julia Petschnigg, Sam F. L. Windels, Janez Povh, Harry Hemingway, Robin Ketteler, Natasa Przulj
NATURE COMMUNICATIONS
(2019)
Article
Chemistry, Medicinal
Natasa Perin, Valentina Rep, Irena Sovic, Stefica Juricic, Danijel Selgrad, Marko Klobucar, Natasa Przulj, Chhedi Lal Gupta, Noel Malod-Dognin, Sandra Kraljevic Pavelic, Marijana Hranjec
EUROPEAN JOURNAL OF MEDICINAL CHEMISTRY
(2020)
Article
Multidisciplinary Sciences
Thomas Gaudelet, Noel Malod-Dognin, Jon Sanchez-Valle, Vera Pancaldi, Alfonso Valencia, Natasa Przulj
Article
Biochemical Research Methods
N. Malod-Dognin, V. Pancaldi, A. Valencia, N. Przulj
Article
Biochemical Research Methods
Jose Lugo-Martinez, Daniel Zeiberg, Thomas Gaudelet, Noel Malod-Dognin, Natasa Przulj, Predrag Radivojac
Summary: This study introduces a hypergraph-based approach for modeling biological systems and formulates vertex classification, edge classification, and link prediction problems on (hyper)graphs. It also presents a novel kernel method on vertex- and edge-labeled hypergraphs for analysis and learning.
Article
Biochemical Research Methods
Sergio Doria-Belenguer, Markus K. Youssef, Rene Bottcher, Noel Malod-Dognin, Natasa Przulj
Article
Biochemical Research Methods
A. Xenos, N. Malod-Dognin, S. Milinkovic, N. Przulj
Summary: This study introduces algorithms based on network embeddings to untangle the complexity of omics data and mine them for new biomedical information. By decomposing matrices with Nonnegative Matrix Tri-Factorization, the study demonstrates that genes with similar biological functions are embedded close together in space and that new biomedical knowledge can be extracted through linear operations on their vector representations. The method successfully predicts new genes participating in protein complexes and identifies cancer-related genes with potential clinical relevance based on cosine similarities between vector representations.
Article
Mathematical & Computational Biology
Noel Malod-Dognin, Natasa Przulj
JOURNAL OF INTEGRATIVE BIOINFORMATICS
(2017)
Article
Biochemical Research Methods
Omer Nebil Yaveroglu, Noel Malod-Dognin, Tijana Milenkovic, Natasa Przulj