Article
Environmental Sciences
Jie Wu, Na Li
Summary: This study proposed using a truncated Gaussian mixture (TGM) distribution as an alternative model to estimate the wind speed probability distribution (WSPD). The parameters of the TGM models were determined using expectation-maximization algorithms, and the estimation accuracy was evaluated using the continuous ranked probability score loss theory. The proposed model was validated using real wind speed data and showed comparable accuracy to other models.
SCIENCE OF THE TOTAL ENVIRONMENT
(2023)
Article
Geochemistry & Geophysics
Jiahui Qu, Qian Du, Yunsong Li, Long Tian, Haoming Xia
Summary: This article proposes a novel Gaussian mixture model-based anomaly detection method for hyperspectral images. Its main contributions are a new extraction approach for anomaly pixels and a weighting approach for fusing the results.
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
(2021)
Article
Engineering, Mechanical
Ronny Francis Ribeiro Junior, Fabricio Alves de Almeida, Ariosto Bretanha Jorge, Joao Luiz Junho Pereira, Matheus Brendon Francisco, Guilherme Ferreira Gomes
Summary: Fault diagnosis is crucial for maintenance industries to prevent catastrophic failures and save time and money. This paper proposes a model using uniaxial acceleration signals to cluster, identify, and diagnose six different failures in electric motors. Experiment results demonstrate the efficiency of the proposed method, with an average accuracy of 97.9%, especially in identifying bearing, unbalanced, and mechanical loss failures. The method can be used for early detection of fault conditions based on real electric motor experiments.
JOURNAL OF THE BRAZILIAN SOCIETY OF MECHANICAL SCIENCES AND ENGINEERING
(2023)
Article
Computer Science, Information Systems
Bin Yu, Yongzheng Zhang, Wenshu Xie, Wenjia Zuo, Yiming Zhao, Yuliang Wei
Summary: This paper introduces a statistical method using the Gaussian mixture model to detect network traffic anomalies, allowing us to learn the normal behavior of a communication process and predict whether a single communication is under attack.
Article
Geochemistry & Geophysics
Weiqiang Zhu, Ian W. McBrearty, S. Mostafa Mousavi, William L. Ellsworth, Gregory C. Beroza
Summary: Earthquake phase association algorithms play a crucial role in earthquake monitoring and research, but can be challenging for densely clustered earthquake sequences. This study presents a novel method, Gaussian Mixture Model Association (GaMMA), which effectively associates seismic phases and provides accurate estimates of earthquake location and magnitude.
JOURNAL OF GEOPHYSICAL RESEARCH-SOLID EARTH
(2022)
Article
Computer Science, Artificial Intelligence
Haosen Liu, Laquan Li, Jiangbo Lu, Shan Tan
Summary: This paper addresses the problem of prior learning in image processing, proposing a new prior model, GSMM, and an efficient image denoising framework. Experimental results show that the GSMM-based methods outperform other prior models and run faster.
IEEE TRANSACTIONS ON IMAGE PROCESSING
(2022)
Article
Neurosciences
Jingkun Wang, Kun Xiang, Kuo Chen, Rui Liu, Ruifeng Ni, Hao Zhu, Yan Xiong
Summary: This paper proposes a method for medical image registration based on the bounded generalized Gaussian mixture model, which is found to significantly outperform other conventional methods through extensive computer simulations.
FRONTIERS IN NEUROSCIENCE
(2022)
Article
Psychology, Mathematical
Jonas M. B. Haslbeck, Jeroen K. Vermunt, Lourens J. Waldorp
Summary: Gaussian mixture models (GMMs) are widely used for exploring heterogeneity in multivariate continuous data, but how well they can be estimated from ordinal data is uncertain. In this study, we investigate this by simulating data from various GMMs, thresholding them into ordinal categories, and evaluating recovery performance. We find that the number of components can be reliably estimated with enough ordinal categories and variables, but the estimates of component model parameters are biased regardless of sample size.
BEHAVIOR RESEARCH METHODS
(2023)
Article
Computer Science, Artificial Intelligence
Tao Li, Jinwen Ma
Summary: This paper introduces a powerful model called the mixture of Gaussian processes (MGP). Conventional MGPs cannot effectively handle the case where the input variable lies on a general manifold or a graph. Based on the attention mechanism, the paper proposes two novel mixture models of Gaussian processes that overcome the limitations of conventional MGPs. Experimental results demonstrate the effectiveness of these methods.
PATTERN RECOGNITION LETTERS
(2022)
Article
Computer Science, Artificial Intelligence
Shuping Sun, Yaonan Tong, Biqiang Zhang, Bowen Yang, Long Yan, Peiguang He, Hong Xu
Summary: In this study, a modified incremental Gaussian mixture model (MIGMM) algorithm is proposed as an improvement of FIGMM, along with an adaptive methodology for removing spurious components in MIGMM. The contributions include a simpler and more efficient prediction matrix update than that of FIGMM, and the use of an effective exponential model and logical matrix to remove spurious components. Experimental results demonstrate the robust performance of the proposed framework in comparison to other methods.
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
(2023)
Article
Engineering, Multidisciplinary
Xiaozhen Zhang, Jinsong Yang, Tiantian Wang, Jingsong Xie, Longzhen Tian
Summary: This paper presents a new fatigue crack size quantification method under a variable temperature environment based on a Gaussian mixture model (GMM). A series of damage indexes are proposed to characterize the interaction mechanism between the signal and crack under temperature change. The effectiveness of the method is validated through fatigue crack experiments in a variable temperature environment.
STRUCTURAL HEALTH MONITORING-AN INTERNATIONAL JOURNAL
(2023)
Article
Computer Science, Interdisciplinary Applications
Shuping Sun, Yaonan Tong, Biqiang Zhang, Bowen Yang, Peiguang He, Wei Song, Wenbo Yang, Yilin Wu, Guangyu Liu
Summary: This study introduces a novel method for adaptively determining the optimal number of components (M) in a Gaussian mixture model when fitting a dataset, avoiding underfitting and overfitting.
JOURNAL OF COMPUTATIONAL SCIENCE
(2022)
Article
Computer Science, Artificial Intelligence
Marek Smieja, Maciej Wolczyk, Jacek Tabor, Bernhard C. Geiger
Summary: SeGMA is a semi-supervised generative model that efficiently learns the joint probability distribution of data and classes, achieving good generative performance and additional features such as interpolation, continuous style transfer, and intensity adjustment of class characteristics in data points.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2021)
Article
Statistics & Probability
Matthias Loeffler, Anderson Y. Zhang, Harrison H. Zhou
Summary: Spectral clustering is a popular algorithm for grouping high-dimensional data that is easy to implement and computationally efficient. Despite its successful applications in practice, its theoretical properties are not fully understood. This paper demonstrates that spectral clustering is minimax optimal in the Gaussian mixture model under certain conditions, without the need for commonly assumed spectral gap conditions.
ANNALS OF STATISTICS
(2021)
Article
Environmental Sciences
Zhen Gao, Kun Fang, Zhipeng Wang, Kai Guo, Yuan Liu
Summary: This paper introduces an ionosphere-free (Ifree) filtering algorithm for ensuring the integrity of a ground-based augmentation system (GBAS). It proposes an overbounding framework based on a Gaussian mixture model (GMM) to handle the errors output by the Ifree algorithm. The performance of the algorithm is evaluated through Monte Carlo simulations and real-world road tests.
Article
Statistics & Probability
Mohadeseh Alsadat Farzammehr, Mohammad Reza Zadkarami, Geoffrey J. McLachlan
Summary: Traditional spatial panel data models typically assume a normal distribution for random error components, which may not be appropriate in many applications. A more flexible approach, the skew-normal generalized spatial panel data model, is proposed here, using a multivariate skew normal distribution for random error components. A Bayesian inference algorithm is developed for parameter estimation, and comparison with the traditional (normal) spatial model is conducted through simulation and analysis of real data on cigarette demand.
COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION
(2021)
Article
Computer Science, Theory & Methods
Hien D. Nguyen, Florence Forbes, Geoffrey J. McLachlan
STATISTICS AND COMPUTING
(2020)
Article
Computer Science, Theory & Methods
Daniel Ahfock, Geoffrey J. McLachlan
STATISTICS AND COMPUTING
(2020)
Article
Statistics & Probability
Sharon X. Lee, Tsung-I Lin, Geoffrey J. McLachlan
Summary: Mixtures of factor analyzers (MFA) are a powerful tool for modeling high-dimensional datasets, with recent generalizations allowing for skewness in the data. The proposed new model based on scale mixtures of canonical fundamental skew normal distributions can capture various types of skewness and asymmetry, accommodating multiple directions of skewness. Parameter estimation for this model can be carried out using maximum likelihood via an EM-type algorithm, and its usefulness and potential have been demonstrated using four real datasets.
ADVANCES IN DATA ANALYSIS AND CLASSIFICATION
(2021)
Article
Statistics & Probability
Sharon X. Lee, Geoffrey J. McLachlan
Summary: In recent years, several mixtures of skew factor analyzers have been proposed with various skew distributions for either the factors or the errors. This paper examines the connections between these formulations and introduces a unified model that allows for skewness in both the factors and errors.
STATISTICS & PROBABILITY LETTERS
(2021)
Article
Statistics & Probability
Mohadeseh Alsadat Farzammehr, Mohsen Mohammadzadeh, Mohammad Reza Zadkarami, Geoffrey J. McLachlan
Summary: This research relaxes the normality assumption of a generalized linear mixed model by using an unrestricted multivariate skew-normal distribution. Parameter estimation is done through a Bayesian inference algorithm, and the proposed skew normal spatial mixed model is compared with the normal spatial mixed model through simulation studies and analysis of real data.
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
(2022)
Article
Computer Science, Interdisciplinary Applications
Daniel Ahfock, Geoffrey J. McLachlan
Summary: In supervised learning, manual labelling of training examples often introduces label noise, but logistic regression can be more robust to label errors when label noise is positively correlated with classification difficulty, improving classification accuracy.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2021)
Article
Computer Science, Theory & Methods
Daniel Ahfock, Saumyadipta Pyne, Geoffrey J. McLachlan
Summary: Data fusion involves integrating multiple related datasets. The statistical file-matching problem is a classic data fusion problem in multivariate analysis. The low-rank structure of factor analysis models can be used to estimate the full covariance matrix, which gives better performance on file-matching problems.
STATISTICS AND COMPUTING
(2021)
Article
Computer Science, Artificial Intelligence
Sharon X. Lee, Geoffrey J. McLachlan, Kaleb L. Leemaqz
Summary: Finite mixture models are powerful tools for modeling and analyzing heterogeneous data, and recent trends show a shift towards using more flexible distributions. This paper presents a parallel implementation of the EM algorithm for these models, suitable for various processors and systems, with numerical experiments and comparisons across different platforms.
STATISTICAL ANALYSIS AND DATA MINING
(2021)
Article
Statistics & Probability
Mohsen Maleki, Geoffrey J. McLachlan, Sharon X. Lee
Summary: This paper introduces a flexible class of multivariate distributions called scale mixtures of fragmental normal (SMFN) distributions. It proposes an extension to the case of a finite mixture of SMFN (FM-SMFN) distributions. The SMFN family of distributions is convenient and effective for modeling data with skewness, discrepant observations, and population heterogeneity. It also possesses other desirable properties, such as an analytically tractable density and ease of computation for simulation and estimation of parameters.
STATISTICAL MODELLING
(2023)
Article
Statistics & Probability
TrungTin Nguyen, Faicel Chamroukhi, Hien D. Nguyen, Geoffrey J. McLachlan
Summary: The class of location-scale finite mixtures is of enduring interest in both applied and theoretical probability and statistics. The paper establishes and proves the following results: (a) location-scale mixtures of a continuous probability density function (PDF) can uniformly approximate any continuous PDF on a compact set with arbitrary accuracy; and (b) for any finite p >= 1, location-scale mixtures of an essentially bounded PDF can approximate any PDF in the L_p norm.
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
(2023)
Article
Computer Science, Interdisciplinary Applications
Daniel Ahfock, Saumyadipta Pyne, Geoffrey J. McLachlan
Summary: The statistical file-matching problem involves data integration with structured missing data, where imputation methods can be nonparametric or parametric. Game theory is used to study the identification problem and establish a general characterization of the minimax optimal strategy. Comparisons show that using the minimax optimal strategy for imputation can better preserve the joint distribution of variables compared to standard algorithms.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2022)
Article
Statistics & Probability
Sharon X. Lee, Geoffrey J. McLachlan
Summary: The literature on non-normal model-based clustering has been expanding in recent years. These models often use a mixture of component densities to provide flexibility in distributional shapes and handle skewness. Skewing is typically achieved by introducing latent variables or considering marginal transformations of the original variables.
JOURNAL OF MULTIVARIATE ANALYSIS
(2022)
Article
Statistics & Probability
Mohadeseh Alsadat Farzammehr, Geoffrey J. McLachlan
Summary: The distribution of observations in most econometric studies with spatial heterogeneity is skewed, and the normality assumption is not always appropriate. This study relaxes the normality assumption in spatial mixed models and allows for spatial heterogeneity. Bayesian mixed modeling with a multivariate skew-elliptical distribution is used for inference, and the proposed model is shown to be superior to conventional ones based on a simulation study and empirical evidence.
COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS
(2022)
Article
Statistics & Probability
Hien D. Nguyen, Daniel Fryer, Geoffrey J. McLachlan
Summary: The study addresses the problem of determining the number of mixture components in finite mixture models, proposing a method based on a sequential testing procedure. Through simulation studies and real data examples, the performance of the proposed method is demonstrated, providing practical recommendations for its application.
JOURNAL OF THE KOREAN STATISTICAL SOCIETY
(2022)