Article
Computer Science, Artificial Intelligence
He Kong, Xiaohu Shi, Limin Wang, Yang Liu, Musa Mammadov, Gaojie Wang
Summary: The paper proposes a novel approach, averaged tree-augmented one-dependence estimators (ATODE), which relaxes the independence assumption of AODE by exploring higher-order conditional dependencies between attributes. Experimental results on 36 datasets demonstrate that the proposed approach can achieve competitive or better classification performance compared to state-of-the-art learners.
APPLIED INTELLIGENCE
(2021)
Article
Economics
Karim Barigou, Pierre-Olivier Goffard, Stephane Loisel, Yahia Salhi
Summary: Predicting the evolution of mortality rates is crucial for life insurance and pension funds. This study proposes a Bayesian negative-binomial framework for mortality modeling to account for overdispersion and parameter uncertainty. Model averaging techniques are employed to address model misspecifications. Two out-of-sample validation methods are proposed and compared with standard Bayesian model averaging. Numerical simulations and real-life mortality datasets demonstrate that the proposed methods outperform the standard approach in terms of prediction performance and robustness.
INTERNATIONAL JOURNAL OF FORECASTING
(2023)
Article
Computer Science, Artificial Intelligence
Limin Wang, Shuai Zhang, Musa Mammadov, Kuo Li, Xinhao Zhang, Siyuan Wu
Summary: This paper proposes a novel weighting paradigm for AODE called semi-supervised weighting (SSW), which balances the approximation of ground-truth dependencies against the effectiveness of probability estimation through supervised and unsupervised weighted AODEs. Experimental results on benchmark datasets validate the effectiveness and robustness of SSW for weighting AODE in terms of zero-one loss, bias, variance, and related measures.
APPLIED INTELLIGENCE
(2022)
Article
Computer Science, Artificial Intelligence
Shenglei Chen, Xin Ma, Linyuan Liu, Limin Wang
Summary: This paper proposes a new attribute ranking approach, MMCMI, which takes into account the redundancies among attributes. Experimental results demonstrate that the MMCMI approach achieves significantly improved classification performance and reduces classification time.
JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE
(2023)
Article
Statistics & Probability
Boxiang Wang, Hui Zou
Summary: Inspired by the Golub-Heath-Wahba formula for ridge regression, a new leave-one-out lemma is introduced for kernel SVM and related large-margin classifiers. An efficient algorithm named magicsvm is designed to train kernel SVM, compute exact leave-one-out cross-validation error, and improve computation speed compared to state-of-the-art SVM solvers. This method also enhances the computation speed of V-fold cross-validation for kernel classifiers.
Article
Ecology
Carles Mila, Jorge Mateu, Edzer Pebesma, Hanna Meyer
Summary: This study proposes a new cross-validation strategy that takes into account the geographical prediction space and compares it with other established methods. The new method, called NNDM LOO CV, provides reliable estimates in all scenarios considered. The existing methods, LOO and bLOO CV, have limitations and only provide accurate estimates in certain situations. Therefore, considering the geographical prediction space is essential when designing map validation methods.
METHODS IN ECOLOGY AND EVOLUTION
(2022)
Article
Computer Science, Artificial Intelligence
Limin Wang, Yibin Xie, Meng Pang, Junyang Wei
Summary: This research focuses on improving the performance of Bayesian network classifiers by using a double weighting scheme in AODE. Experimental evaluations show that attribute weighting and model weighting are complementary, and DWAODE demonstrates significant advantages in terms of zero-one loss, bias-variance decomposition, RMSE, Friedman and Nemenyi tests.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Astronomy & Astrophysics
Ryan C. Challener, Luis Welbanks, Peter McGill
Summary: This research investigates eclipse mapping using Leave-one-out Cross Validation (LOO-CV) and applies it to the observation of WASP-18b. The study shows that constraints on planetary brightness patterns are influenced by phase-curve variation and the shape of eclipse ingress and egress. Additionally, the research explores the relationship between different brightness map components under a positive-flux constraint.
ASTRONOMICAL JOURNAL
(2023)
Article
Engineering, Multidisciplinary
Yong Pang, Yitang Wang, Xiaonan Lai, Shuai Zhang, Pengwei Liang, Xueguan Song
Summary: This paper proposes an enhanced-LOOCV method that improves the accuracy and efficiency of the traditional LOOCV in the model estimation and selection of the Kriging surrogate model for engineering problems. By incorporating hyperparameters from the complete Kriging model, it reduces the number of hyperparameter optimizations and achieves better estimation performance. The proposed decremental calculation also reduces computational costs and improves the time complexity of the traditional LOOCV.
COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING
(2023)
Article
Geosciences, Multidisciplinary
Thi Lan Anh Dinh, Filipe Aires
Summary: The use of statistical models to study the impact of weather on crop yield has been increasing. However, the limited number of samples in this type of application makes it challenging to use traditional methods for dataset splitting and model selection. In this study, a nested leave-two-out cross-validation method is proposed for choosing the best model, and it is shown that simpler models perform better under reliable generalization testing. This approach is significant for statistical crop modeling.
GEOSCIENTIFIC MODEL DEVELOPMENT
(2022)
Article
Operations Research & Management Science
Lizhi Wang
Summary: Cross validation is a widely used method to evaluate the performance of prediction models. Leave-k-out and m-fold are popular cross validation criteria, with different strengths and limitations. This study proposes a new criterion, leave-worst-k-out, to combine the advantages of leave-k-out and m-fold. Experimental results show that the leave-worst-k-out criterion outperforms leave-k-out and m-fold in assessing the generalizability of prediction models. It can be efficiently computed using a random sampling algorithm.
OPTIMIZATION LETTERS
(2023)
Article
Computer Science, Artificial Intelligence
Li-Min Wang, Peng Chen, Musa Mammadov, Yang Liu, Si-Yuan Wu
Summary: A novel weighted AODE algorithm (AWODE) is proposed in this study, which adaptively selects weights to alleviate the independence assumption and make the learned probability distribution fit the instance. Experimental results on 40 benchmark datasets demonstrate that this approach achieves a bias-variance trade-off.
INTELLIGENT DATA ANALYSIS
(2021)
Article
Computer Science, Information Systems
Faten Khalid Karim, Hela Elmannai, Abdelrahman Seleem, Safwat Hamad, Samih M. Mostafa
Summary: Handling missing values and feature selection are crucial preprocessing tasks for pattern recognition and machine learning applications. This paper proposes a new algorithm called CBRSL, which effectively manipulates missing values using feature selection. Experimental results demonstrate that CBRSL outperforms other imputation methods in terms of accuracy and can handle missing values generated from various mechanisms.
Article
Statistics & Probability
Paul-Christian Burkner, Jonah Gabry, Aki Vehtari
Summary: This study introduces a method to efficiently compute and validate exact and approximate LOO-CV for any Bayesian non-factorized model with a multivariate normal or Student-t distribution. The method is demonstrated using lagged simultaneously autoregressive (SAR) models as a case study.
COMPUTATIONAL STATISTICS
(2021)
Article
Chemistry, Physical
Xiao Liu, Mengxian Yu, Qingzhu Jia, Fangyou Yan, Yin-Ning Zhou, Qiang Wang
Summary: In this study, a rigorous evaluation method called LOIO-CV is proposed to evaluate the thermodynamic properties of ionic liquids (ILs). Two f(T,P,I)-QSPR models are developed with norm indexes to predict the density and viscosity of ILs at variable temperatures and pressures by balancing the distribution of data points in ILs. The LOIO-CV method enhances the stability of QSPR models when predicting the properties of ILs with novel cations and anions, which is crucial for the data-driven design of new ILs.
JOURNAL OF MOLECULAR LIQUIDS
(2023)
Article
Computer Science, Artificial Intelligence
Benjamin Lucas, Charlotte Pelletier, Daniel Schmidt, Geoffrey I. Webb, Francois Petitjean
Summary: Land cover maps are essential for environmental research and management. This paper presents Sourcerer, a semi-supervised domain adaptation technique that uses deep learning to generate land cover maps from satellite image time series data. Experimental results show that Sourcerer achieves high accuracy even with limited labeled target data.
Review
Biochemical Research Methods
Fuyi Li, Shuangyu Dong, Andre Leier, Meiya Han, Xudong Guo, Jing Xu, Xiaoyu Wang, Shirui Pan, Cangzhi Jia, Yang Zhang, Geoffrey Webb, Lachlan J. M. Coin, Chen Li, Jiangning Song
Summary: Conventional supervised binary classification algorithms have been widely used in biological and biomedical data analysis. However, labeling data can be laborious, leading to the proposal of the positive unlabeled (PU) learning scheme. This approach allows for learning from limited positive samples and a large number of unlabeled samples, contributing to the development of various PU learning algorithms for addressing biological questions.
BRIEFINGS IN BIOINFORMATICS
(2022)
Review
Biochemical Research Methods
Meng Zhang, Cangzhi Jia, Fuyi Li, Chen Li, Yan Zhu, Tatsuya Akutsu, Geoffrey Webb, Quan Zou, Lachlan J. M. Coin, Jiangning Song
Summary: This study provides benchmark datasets for promoter prediction in 58 different species, and finds that deep learning and traditional machine learning-based approaches generally outperform scoring function-based approaches.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Economics
Rakshitha Godahewa, Christoph Bergmeir, Geoffrey I. Webb, Pablo Montero-Manso
Summary: Nowadays, accurate forecasts for weekly time series are needed in many businesses and industries. However, the current forecasting literature lacks easy-to-use, automatic, reproducible, and accurate approaches for this task. To address this gap, we propose a forecasting method that leverages state-of-the-art techniques, including forecast combination, meta-learning, and global modeling. Our proposed method, based on a stacking approach with lasso regression, outperforms benchmarks and state-of-the-art models in terms of accuracy and consistently produces the most accurate forecasts for the M4 weekly dataset.
INTERNATIONAL JOURNAL OF FORECASTING
(2023)
Article
Pharmacology & Pharmacy
Monica Jung, Dickson Lukose, Suzanne Nielsen, J. Simon Bell, Geoffrey Webb, Jenni Ilomaki
Summary: The COVID-19 pandemic has disrupted healthcare seeking and delivery, and different Australian jurisdictions implemented varying restrictions. Analyzing national pharmacy dispensing data in Australia, it was found that after nationwide COVID-19 restrictions, the incidence and prevalence of opioid dispensing decreased in Victoria, New South Wales, and other jurisdictions. However, in Victoria post-lockdown, both the incidence and prevalence increased. There were no significant changes in the initiation of long-term opioid use in any jurisdiction. More stringent restrictions were associated with greater reductions in overall opioid initiation, but not in long-term opioid use initiation.
BRITISH JOURNAL OF CLINICAL PHARMACOLOGY
(2023)
Article
Computer Science, Artificial Intelligence
Chang Wei Tan, Matthieu Herrmann, Geoffrey I. Webb
Summary: Nearest neighbour similarity measures are widely used in time series data analysis applications. This paper proposes ULTRA-FASTMPSEARCH, a family of algorithms for learning meta-parameters for different types of time series distance measures. These algorithms are significantly faster than the previous state of the art.
KNOWLEDGE AND INFORMATION SYSTEMS
(2023)
Review
Cardiac & Cardiovascular Systems
Adam C. Livori, Dickson Lukose, J. Simon Bell, Geoffrey I. Webb, Jenni Ilomaki
Summary: COVID-19 restrictions did not result in significant changes in the incidence, prevalence, or adherence to statins in Australia. Adaptive interventions, such as telehealth consultations and medication delivery, successfully maintained access to cardiovascular medications.
CURRENT PROBLEMS IN CARDIOLOGY
(2023)
Article
Computer Science, Artificial Intelligence
Ahmed Shifaz, Charlotte Pelletier, Francois Petitjean, Geoffrey I. Webb
Summary: This paper presents multivariate versions of seven commonly used elastic similarity and distance measures for time series data analytics. These measures can compensate for misalignments in the time axis of time series data. The paper adapts two existing strategies used in multivariate Dynamic Time Warping to these measures. Demonstrating their utility in multivariate time series classification using the nearest neighbor classifier, the paper shows that each measure achieves the highest accuracy on at least one dataset, supporting the value of developing a suite of multivariate similarity and distance measures. The paper also constructs a nearest neighbor-based ensemble of the measures, which proves to be competitive with other state-of-the-art single-strategy multivariate time series classifiers.
KNOWLEDGE AND INFORMATION SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Rakshitha Godahewa, Geoffrey I. Webb, Daniel Schmidt, Christoph Bergmeir
Summary: This paper explores the close connections between Threshold Autoregressive (TAR) models and regression trees. It introduces a new forecasting-specific tree algorithm called SETAR-Tree, which trains global Pooled Regression (PR) models in the leaves to learn cross-series information. The proposed tree and forest models outperform state-of-the-art tree-based algorithms and forecasting benchmarks in terms of accuracy.
Article
Computer Science, Artificial Intelligence
Matthieu Herrmann, Geoffrey I. Webb
Summary: Dynamic Time Warping (DTW) is a time series distance measure that allows for non-linear alignments between sequences. To address the permissiveness issue of unconstrained DTW, constraints in the form of windows and weights have been introduced. However, these approaches have limitations, such as crude step functions and relative weights. In this paper, Amerced Dynamic Time Warping (ADTW) is proposed as a new variant that penalizes warping with a fixed additive cost, providing both constraints and intuitive outcomes.
PATTERN RECOGNITION
(2023)
Article
Computer Science, Interdisciplinary Applications
Yashpal Ramakrishnaiah, Nenad Macesic, Geoffrey I. Webb, Anton Y. Peleg, Sonika Tyagi
Summary: The adoption of electronic health records (EHRs) has created opportunities for predicting clinical outcomes and improving patient care. However, non-standardized data representations and anomalies present major challenges in digital health research. To address these challenges, we have developed EHR-QC, a tool with two modules: data standardization and preprocessing. We believe that the development and adoption of tools like EHR-QC are critical for advancing digital health.
JOURNAL OF BIOMEDICAL INFORMATICS
(2023)
Article
Computer Science, Artificial Intelligence
Huan Zhang, Liangxiao Jiang, Geoffrey I. Webb
Summary: Naive Bayes is a classical machine learning algorithm that often uses discretization to transform quantitative attributes into qualitative attributes. Non-Disjoint Discretization (NDD) is a novel method that forms overlapping intervals and always locates a value toward the middle of an interval. However, existing approaches to NDD fail to consider the effect of multiple occurrences of a single value. In this study, a new method called Rigorous Non-Disjoint Discretization (RNDD) is proposed to handle multiple occurrences of a single value in a systematic manner, and it outperforms NDD and other existing competitors.
PATTERN RECOGNITION
(2023)
Article
Biochemical Research Methods
Tong Pan, Chen Li, Yue Bi, Zhikang Wang, Robin B. Gasser, Anthony W. Purcell, Tatsuya Akutsu, Geoffrey Webb, Seiya Imoto, Jiangning Song
Summary: PFresGO is an attention-based deep-learning approach that incorporates hierarchical structures in Gene Ontology (GO) graphs and natural language processing algorithms for functional annotation of proteins, achieving superior performance compared to existing methods.
Proceedings Paper
Computer Science, Artificial Intelligence
Gautier Pialla, Hassan Ismail Fawaz, Maxime Devanne, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller, Christoph Bergmeir, Daniel Schmidt, Geoffrey Webb, Germain Forestier
Summary: Adversarial attacks pose a threat to deep neural networks, especially in the case of time series. Existing attacks for time series are few and often detectable. To address this issue, we propose a new attack method that generates smoother perturbations, and we improve model robustness through adversarial training.
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2022, PT I
(2022)
Meeting Abstract
Public, Environmental & Occupational Health
Monica Jung, Dickson Lukose, Suzanne Nielsen, J. Simon Bell, Geoffrey Webb, Jenni Ilomaki
PHARMACOEPIDEMIOLOGY AND DRUG SAFETY
(2022)