Article
Computer Science, Information Systems
Hui Xu, Qicheng Liu
Summary: This paper proposes an algorithm based on density peaks clustering and fitness to address the low classification accuracy of the minority class in imbalanced data. Experimental results show that the algorithm outperforms other algorithms.
Article
Computer Science, Theory & Methods
Ankit Srivastava, Sriram P. Chockalingam, Srinivas Aluru
Summary: This article presents a parallel framework for scaling Bayesian network structure learning algorithms to tens of thousands of variables. The framework parallelizes three different algorithms and is able to construct large-scale networks from real data sets in less than a minute on 1024 cores, achieving significant speedup and efficiency. The scalability of the framework is also demonstrated using simulated data sets.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Mustafa A. Kocak, David Ramirez, Elza Erkip, Dennis E. Shasha
Summary: SafePredict is a novel meta-algorithm that works with any base prediction algorithm to guarantee a chosen correctness rate by allowing refusals. It does not rely on assumptions about data distribution or base predictor and adapts to changes in the base predictor's error rate without knowing when the changes occur.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2021)
Article
Computer Science, Artificial Intelligence
Jaiprakash Nagar, Sanjay Kumar Chaturvedi, Sieteng Soh, Abhilash Singh
Summary: This study proposes a machine learning approach based on the generalized regression neural network (GRNN) to predict the k-coverage performance of wireless multihop networks (WMNs) placed in a rectangular region. The proposed approach achieves better prediction accuracy and lower computational time complexity than existing benchmark algorithms in scenarios both with and without boundary effects (BEs).
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Computer Science, Artificial Intelligence
Rui Zhang, Hongyuan Zhang, Xuelong Li
Summary: The article proposes a new clustering framework that aims to maximize the joint probability of data and parameters, and can use a prior distribution to measure the rationality of different representations.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Feng Wang, Tao Kong, Rufeng Zhang, Huaping Liu, Hang Li
Summary: This paper presents Twist, a self-supervised representation learning method that classifies large-scale unlabeled datasets in an end-to-end manner. The authors use a siamese network with a softmax operation to generate twin class distributions for augmented images. By maximizing the mutual information between input images and output class predictions, Twist avoids collapsed solutions and achieves state-of-the-art performance on various tasks. On the semi-supervised classification task, Twist outperforms previous methods by 6.2% in top-1 accuracy using 1% of ImageNet labels with a ResNet-50 backbone. Code and pre-trained models are available at https://github.com/bytedance/TWIST.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2023)
Article
Automation & Control Systems
Gao Huang, Chaoqun Du
Summary: This paper proposes a novel assumption and algorithm for semi-supervised learning, which complements the common low-density separation assumption and solves the transductive label assignment problem. Experimental results show that the proposed algorithm achieves competitive performance on multiple datasets and is almost one order of magnitude faster than existing SSL approaches.
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS
(2022)
Article
Automation & Control Systems
Emily A. A. Reed, Guilherme Ramos, Paul Bogdan, Sergio Pequito
Summary: In this article, a scalable distributed solution is proposed for finding strongly connected components (SCCs) and the diameter of a directed network. The solution leverages dynamical consensus-like protocols and has a time complexity of O(N D d_max^in), where N is the number of vertices, D is the network diameter, and d_max^in is the maximum in-degree. It is proven that the algorithm terminates in D + 2 iterations, allowing the retrieval of the finite network diameter. Exhaustive simulations demonstrate the superior performance of the proposed algorithm on various random networks.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL
(2023)
Article
Geochemistry & Geophysics
Ji Chang, Yu Kang, Zerui Li, Wei Xing Zheng, Wenjun Lv, De-Yong Feng
Summary: Cross-domain lithology identification is a challenging problem that aims to predict the lithology of an uninterpreted well using logging data from an interpreted well. In this study, we propose a novel framework that combines active learning and domain adaptation to address the issues of data distribution shift and expensive label acquisition. Experimental results demonstrate that our method effectively suppresses performance degradation caused by data distribution shift and requires fewer target label queries.
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
(2022)
Article
Computer Science, Artificial Intelligence
Yingying Chen, Zijie Hong, Xiaowei Yang
Summary: This article introduces a cost-sensitive online adaptive kernel learning algorithm to address large-scale imbalanced classification problems. It proposes a misclassification cost to balance the accuracy between the minority class and the majority class. Experimental results demonstrate that the algorithm significantly improves classification performance on most large-scale imbalanced datasets.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Computer Science, Artificial Intelligence
De Cheng, Jingyu Zhou, Nannan Wang, Xinbo Gao
Summary: This paper introduces a hybrid dynamic cluster contrast and probability distillation algorithm for unsupervised person re-identification. The algorithm makes use of the self-supervised signals of both clustered and un-clustered instances, as well as informative and valuable training examples, for effective and robust training.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2022)
Article
Water Resources
K. L. Chong, Y. F. Huang, C. H. Koo, Mohsen Sherif, Ali Najah Ahmed, Ahmed El-Shafie
Summary: Streamflow forecasting is crucial in water resources management, and this paper explores the use of machine learning algorithms for two distinct streamflow forecasting problems. The study finds that categorical streamflow forecasting outperforms regression-based forecasting, and that forest-based algorithms are superior for predicting high streamflow fluctuations with low-dimensional input. Furthermore, encoding streamflow time series as images for forecasting demands further analysis, as different approaches yield varying results.
APPLIED WATER SCIENCE
(2023)
Article
Computer Science, Artificial Intelligence
Zengyou He, Wenfang Chen, Xiaoqi Wei, Yan Liu
Summary: As one of the most important topics in data mining and network science, the community detection problem has been extensively studied. However, determining the statistical significance of an individual community in a weighted network remains unsolved. In this study, a new method is proposed to calculate the analytical p-value of an individual community in weighted networks, and it is utilized as the objective function in a local search procedure to derive a new community detection algorithm. Experimental results demonstrate that the new algorithm achieves comparable performance to state-of-the-art algorithms for identifying communities in weighted networks.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Computer Science, Information Systems
Ibomoiye Domor Mienye, Yanxia Sun
Summary: Ensemble learning techniques have achieved state-of-the-art performance by combining predictions from multiple base models. This overview focuses on widely used algorithms such as random forest, AdaBoost, gradient boosting, XGBoost, LightGBM, and CatBoost, and aims to provide concise coverage of their mathematical and algorithmic representations, which is lacking in the existing literature, for the benefit of machine learning researchers and practitioners.
Article
Computer Science, Artificial Intelligence
Shi-Xue Zhang, Xiaobin Zhu, Lei Chen, Jie-Bo Hou, Xu-Cheng Yin
Summary: Arbitrary shape text detection is a challenging task, but segmentation-based methods using probability maps show promising results in accurately detecting text instances. This paper proposes an innovative and robust segmentation-based detection method that uses Sigmoid Alpha Functions to transfer distances into probability maps, and a group of probability maps to cover complex probability distributions. The method achieves state-of-the-art performance in terms of detection accuracy on several benchmarks, including multi-oriented and multilingual datasets.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Computer Science, Information Systems
David Atienza, Concha Bielza, Pedro Larranaga
Summary: Semiparametric Bayesian networks combine parametric and nonparametric conditional probability distributions to incorporate the advantages of both components. By considering different types of conditional probability distributions and modifying learning algorithms, the proposed approach achieves comparable performance to state-of-the-art methods.
INFORMATION SCIENCES
(2022)
Article
Computer Science, Artificial Intelligence
Fernando Rodriguez-Sanchez, Concha Bielza, Pedro Larranaga
Summary: This paper introduces a multipartition clustering method for mixed data, which efficiently handles multifaceted data with several reasonable interpretations by utilizing Bayesian network factorization and the variational Bayes framework.
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
David Quesada, Concha Bielza, Pedro Fontan, Pedro Larranaga
Summary: When modeling multivariate continuous time series, it is common to encounter processes that are nonlinear or that drift away from the original distribution. To address this issue, we propose a hybrid model that combines a model tree with dynamic Bayesian networks (DBNs) to obtain nonlinear forecasts. Experimental results demonstrate that our model outperforms standard DBN models when dealing with nonlinear processes and is competitive with state-of-the-art time series forecasting methods.
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
A. Abanda, U. Mori, Jose A. Lozano
Summary: This study investigates time series classifier recommendation for the first time, considering various recommendation forms or meta-targets. The researchers design a set of quick estimators as predictors for the recommendation system. Experimental results show that the proposed method outperforms other methods in most scenarios, and a hierarchical inference method for meta-targets is also proposed.
PATTERN RECOGNITION
(2022)
Article
Computer Science, Artificial Intelligence
Abolfazl Shirazi, Josu Ceberio, Jose A. Lozano
Summary: This article introduces a new algorithm (EDA++) equipped with mechanisms to handle nonlinear constraints by adopting the framework of estimation of distribution algorithms (EDAs). The study shows that the feasibility of the final solutions is guaranteed and the quality of the solutions in terms of objective values is improved by seeding an initial population of feasible solutions to the algorithm.
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
(2022)
Article
Computer Science, Information Systems
Carlos Puerto-Santana, Concha Bielza, Javier Diaz-Rozo, Guillem Ramirez-Gargallo, Filippo Mantovani, Gaizka Virumbrales, Jesus Labarta, Pedro Larranaga
Summary: This study introduces a methodology for health assessment based on online novelty detection and asymmetrical hidden Markov models for predicting the remaining useful life of ball bearings in industrial assets. The approach is designed to adapt to natural degradation of mechanical components and can be deployed in online environments. Performance analysis and validation with real datasets showcase the advantages of this methodology.
IEEE INTERNET OF THINGS JOURNAL
(2022)
Article
Computer Science, Artificial Intelligence
Jairo Rojas-Delgado, Josu Ceberio, Borja Calvo, Jose A. Lozano
Summary: This work delves into the Bayesian statistical assessment of experimental results, proposing a framework for analyzing multiple algorithms on multiple problems/instances by transforming experimental results into rankings and estimating the posterior distribution of the parameters of probability models. Various inferences regarding algorithm rankings are examined, and a Python package and source code implementation are provided for other researchers to utilize.
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION
(2022)
Article
Computer Science, Artificial Intelligence
Anehd Blazquez-Garcia, Kristoffer Wickstrom, Shujian Yu, Karl Oyvind Mikalsen, Ahcene Boubekki, Angel Conde, Usue Mori, Robert Jenssen, Jose A. Lozano
Summary: This paper proposes a selective imputation method for handling missing values in multivariate time series data. By using multi-objective optimization techniques, the method selects the time points to impute in order to reduce imputation uncertainty and accurately represent the original time series. Experimental results show that this method can improve the performance of downstream tasks while maintaining the quality of the imputations.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Computer Science, Interdisciplinary Applications
Onintze Zaballa, Aritz Perez, Elisa Gomez Inhiesto, Teresa Acaiturri Ayesta, Jose A. Lozano
Summary: This paper presents a probabilistic generative model for disease modeling and patient treatment based on Electronic Health Records. The model aims to identify different subtypes of treatments for a given disease and discover their development and progression. It considers the hierarchical structure of latent variables to classify and segment the treatment sequences. The model's learning procedure is efficiently solved with the Expectation-Maximization algorithm based on dynamic programming. The evaluation includes recovering the generative model underlying synthetic data and assessing the model's ability to provide treatment classification and staging information in real-world data. The model can be used for classification, simulation, data augmentation, and missing data imputation.
JOURNAL OF BIOMEDICAL INFORMATICS
(2023)
Review
Computer Science, Artificial Intelligence
Carlos Villa-Blanco, Concha Bielza, Pedro Larranaga
Summary: Real-world problems often have high feature dimensionality, making it difficult to model and analyze the data. Feature subset selection (FSS) techniques can be used to reduce irrelevant or redundant information, improving the speed and performance of building models. This review focuses on incremental FSS algorithms that can efficiently handle large volumes of data received sequentially. Different strategies, such as updating feature weights incrementally, applying information theory, or using rough set-based FSS, are discussed, along with various supervised and unsupervised learning tasks where FSS is applicable.
ARTIFICIAL INTELLIGENCE REVIEW
(2023)
Article
Computer Science, Artificial Intelligence
Josu Ircio, Aizea Lojo, Usue Mori, Simon Malinowski, Jose A. Lozano
Summary: This paper addresses imbalanced time series classification problems and proposes a method for learning time series classifiers that maximize the minimum recall rather than accuracy. By applying several smooth approximations of the minimum recall function, the approach improves the performance of state-of-the-art methods in imbalanced time series classification, with only a slight loss in accuracy.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Xabier Benavides, Josu Ceberio, Leticia Hernando, Jose A. Lozano
Summary: Previous works have shown that studying the characteristics of the Quadratic Assignment Problem (QAP) is crucial in designing tailored meta-heuristic algorithms. This study focuses on the Elementary Landscape Decomposition (ELD) method, which is widely used but lacks a clear understanding of its measurement components. To address this issue, this work further decomposes the ELD and conducts experiments to explain the behavior of ELD-based methods, providing critical information about their potential applications.
PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, GECCO 2023
(2023)
Article
Biochemical Research Methods
Niko Bernaola, Mario Michiels, Pedro Larranaga, Concha Bielza
Summary: We present FGES-Merge, a new method for learning the structure of gene regulatory networks by merging locally learned Bayesian networks using the fast greedy equivalent search algorithm. The method is competitive in terms of accuracy and speed, scaling up to large networks and incorporating empirical knowledge of gene regulatory network topology. We also introduce a visualization tool for exploring massive networks and identifying nodes of interest. Our work contributes to predicting gene interactions on a large scale and provides a valuable resource for future biological research.
PLOS COMPUTATIONAL BIOLOGY
(2023)
Article
Computer Science, Artificial Intelligence
Carlos Puerto-Santana, Pedro Larranaga, Concha Bielza
Summary: This article introduces asymmetric hidden Markov models with feature saliencies, which are capable of simultaneously determining relevant variables/features and probabilistic relationships between variables during their learning phase. Compared with other approaches, the proposed models have better or equal fitness and provide further data insights.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Carlos Puerto-Santana, Pedro Larranaga, Javier Diaz-Rozo, Concha Bielza
Summary: This paper focuses on data streams produced by sensors in industrial environments and proposes an online feature subset selection methodology based on hidden Markov models (HMMs) to determine the relevant fundamental and harmonic frequencies during the operation of ball bearings.
16TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING MODELS IN INDUSTRIAL AND ENVIRONMENTAL APPLICATIONS (SOCO 2021)
(2022)