Article
Computer Science, Theory & Methods
Rico Krueger, Michel Bierlaire, Thomas Gasos, Prateek Bansal
Summary: This study analyzes two robust alternatives to the multinomial probit model and demonstrates their advantages through simulation and case studies.
STATISTICS AND COMPUTING
(2023)
Article
Statistics & Probability
Tsung-I Lin, Wan-Lun Wang
Summary: This paper derives explicit expressions for the moments of truncated multivariate normal/independent distributions with supports confined within a hyper-rectangle. A Monte Carlo experiment is conducted to validate the proposed formulae for five selected members of the distributions.
JOURNAL OF MULTIVARIATE ANALYSIS
(2024)
Article
Chemistry, Analytical
Maxime Metz, Florent Abdelghafour, Jean-Michel Roger, Matthieu Lesnoff
Summary: The paper presents a novel robust PLSR algorithm, RoBoost-PLSR, inspired by boosting principles, which mitigates the impact of outliers during calibration. Compared to other algorithms, RoBoost-PLSR demonstrates resilience and good performance on multiple datasets.
ANALYTICA CHIMICA ACTA
(2021)
Article
Engineering, Industrial
Hamideh Iranmanesh, Mehdi Jabbari Nooghabi, Abbas Parchami
Summary: This article presents a robust yield test to investigate the performance of an industrial production process in the presence of outliers. A new robust estimator of S_pk is introduced to test the production yield for any normal distribution in the presence of various numbers of outliers. Moreover, a Monte Carlo simulation method is proposed for testing the production yield based on the yield index S_pk for normal data, and its applicability to some non-normal data is discussed.
QUALITY ENGINEERING
(2023)
Article
Engineering, Multidisciplinary
Jose Ragot
Summary: Detecting and locating outliers in measurements used for monitoring systems is crucial. Redundant information is needed for this. Sometimes, a robust approach that minimizes the impact of outliers is preferred.
Article
Computer Science, Interdisciplinary Applications
Mehrdad Naderi, Elham Mirfarah, Wan-Lun Wang, Tsung-I Lin
Summary: Mixture regression models (MRMs) are widely used to capture the heterogeneity of relationships between the response variable and predictors from non-homogeneous groups. However, conventional MRMs are sensitive to departures from normality. A unified approach using normal mean-variance mixture (NMVM) distributions is proposed to robustify MRMs. An ECME algorithm is developed for ML estimation, and simulation studies demonstrate the finite-sample properties and robustness of the proposed model. Real data applications further illustrate the usefulness and superiority of the methodology.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2023)
Article
Engineering, Mechanical
Xiaokai Wei, Jie Li, Debiao Zhang, Kaiqiang Feng
Summary: An improved factor graph method based on enhanced robustness is proposed to improve the navigation performance and robustness of an INS/GPS/OD integrated navigation system. By dynamically adjusting factor weights, the method achieved a significant increase in navigation accuracy and outperformed existing methods.
MECHANICAL SYSTEMS AND SIGNAL PROCESSING
(2021)
Article
Chemistry, Analytical
Yifan Wang, Guolin Yu, Jun Ma
Summary: In this paper, a novel robust loss function is designed and a new binary classification learning method is proposed to improve classification performance and robustness while reducing the influence of outliers on the model. The introduction of regularization terms realizes the principle of structural risk minimization, and a simple and efficient iterative algorithm is designed to solve the non-convex optimization problem.
Article
Management
Alexander D. Stead, Phill Wheat, William H. Greene
Summary: The robustness of efficiency scores in decision-making units is important in managerial or regulatory benchmarking. However, the robustness of maximum likelihood estimation of stochastic frontier models has not been thoroughly explored. This study examines the influence function of the estimator in a stochastic frontier context and derives sufficient conditions for robust maximum likelihood estimation based on the properties of error component distributions and copula density. It is found that the canonical distributional assumptions do not satisfy these conditions. The Student's t noise distribution shows attractive properties and can be paired with a broad range of inefficiency distributions while satisfying the conditions under independence. The parameter estimates and efficiency predictions from robust specifications are less sensitive to contaminating observations compared to non-robust specifications.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2023)
Article
Computer Science, Artificial Intelligence
Matteo Zecchin, Sangwoo Park, Osvaldo Simeone, Marios Kountouris, David Gesbert
Summary: Standard Bayesian learning generalizes suboptimally under model misspecification and in the presence of outliers. PAC-Bayes theory shows that the free energy criterion of Bayesian learning bounds the generalization error for Gibbs predictors under uncontaminated sampling distributions. This explains the limitations of Bayesian learning under misspecified models and outliers. Recent work introduces PAC(m) bounds to enhance performance under misspecification, and this work proposes a robust free energy criterion combining the generalized logarithm score function with PAC(m) ensemble bounds, counteracting the effects of misspecification and outliers.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2023)
Article
Mathematical & Computational Biology
John T. Gregg, Jason H. Moore
Summary: Currently, there are no univariate outlier detection algorithms that can accurately remove univariate outliers from arbitrarily shaped distributions. To address this, we have developed a new algorithm called STAR_outliers, which effectively removes outliers from distributions with various shape profiles. Our results demonstrate that STAR_outliers outperforms other commonly used methods in terms of precision and recall for removing simulated outliers, as well as accurately modeling outlier bounds in real data distributions.
Article
Computer Science, Artificial Intelligence
Abdul Wahid, Dost Muhammad Khan, Ijaz Hussain, Sajjad Ahmad Khan, Zardad Khan
Summary: A novel robust unsupervised feature selection method, UFS-RDR, is proposed to improve feature selection performance by minimizing a graph-regularized weighted data reconstruction error function, using the Mahalanobis distance to detect outliers and determine a Huber-type weight function. The experimental results show that UFS-RDR outperforms non-robust methods in the presence of contamination in unlabeled data.
EXPERT SYSTEMS WITH APPLICATIONS
(2022)
Article
Energy & Fuels
Hua Kuang, Risheng Qin, Mi He, Xin He, Ruimin Duan, Cheng Guo, Xian Meng
Summary: This paper investigates an outlier-cleaning method for measurement data in distribution networks, which uses association rules and various techniques to detect outliers and calculate repair costs, achieving precise cleaning of outliers. The superiority of the proposed method is verified through test results on simulated datasets.
FRONTIERS IN ENERGY RESEARCH
(2021)
Article
Physics, Multidisciplinary
Aurea Grane, Giancarlo Manzi, Silvia Salini
Summary: This study proposes a new protocol that combines robust distances and visualization techniques for dynamic mixed data. Several graphical tools are introduced to monitor the evolution of distances and detect outliers. The methodology is illustrated on a real COVID-19 dataset.
Article
Statistics & Probability
John B. Holmes, Matthew R. Schofield
Summary: This research investigates the logit transformation of a normal random variable and proposes methods for constructing positive integer moments using recurrence relations and infinite sums of hyperbolic, exponential, and trigonometric functions. The study also determines criteria for truncating these infinite sums, improving computational efficiency for estimating logit-normal moments. Moreover, it reveals the exact analytic functions of negative moments and establishes a relationship between log-normal and logit-normal moments for different values of the parameter mu, providing an exact expression for the first moment when mu is an integer.
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
(2022)
Article
Biochemical Research Methods
Xuan Li, Yuejiao Fu, Xiaogang Wang, Weiliang Qiu
BMC BIOINFORMATICS
(2018)
Article
Physics, Multidisciplinary
Yuejiao Fu, Hangjing Wang, Augustine Wong
Article
Biochemistry & Molecular Biology
Xuan Li, Yuejiao Fu, Xiaogang Wang, Dawn L. DeMeo, Kelan Tantisira, Scott T. Weiss, Weiliang Qiu
INTERNATIONAL JOURNAL OF GENOMICS
(2018)
Article
Computer Science, Interdisciplinary Applications
Yin Cui, Yuejiao Fu, Abdulkadir Hussein
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2009)
Article
Statistics & Probability
Jiahua Chen, Pengfei Li, Yuejiao Fu
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2012)
Article
Multidisciplinary Sciences
Xuan Li, Weiliang Qiu, Jarrett Morrow, Dawn L. DeMeo, Scott T. Weiss, Yuejiao Fu, Xiaogang Wang
Article
Statistics & Probability
Xiaoqing Niu, Pengfei Li, Yuejiao Fu
JOURNAL OF APPLIED STATISTICS
(2019)
Article
Statistics & Probability
Bin Sun, Yuehua Wu, Wenzhi Yang, Yuejiao Fu
COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
(2020)
Article
Statistics & Probability
Guanfu Liu, Yuejiao Fu, Jianjun Zhang, Xiaolong Pu, Boying Wang
JOURNAL OF APPLIED STATISTICS
(2020)
Article
Statistics & Probability
Guanfu Liu, Yuejiao Fu, Wenchen Liu, Rongji Mu
Summary: Contaminated mixture models (CMMs) have wide applications in the real world. An EM-test has been developed for testing homogeneity in CMMs, and simulation studies demonstrate its excellent finite-sample performance. Two real-data examples illustrate the applications of the proposed method.
CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE
(2022)
Article
Physics, Multidisciplinary
Xiaoping Shi, Yue Zhang, Yuejiao Fu
Summary: This paper focuses on the homogeneity test for evaluating whether two multivariate samples come from the same distribution. Various methods have been proposed in the literature, but they may not be very powerful. Based on data depth, two new test statistics are proposed for the multivariate two-sample homogeneity test; both have the same asymptotic null distribution, chi-squared with one degree of freedom. The generalization of these tests to the multivariate multisample situation is also discussed. Simulation studies demonstrate the superior performance of the proposed tests, and the test procedure is illustrated through two real data examples.
Article
Statistics & Probability
Yuejiao Fu, Yukun Liu, Hsiao-Hsuan Wang, Xiaogang Wang
STATISTICAL THEORY AND RELATED FIELDS
(2020)
Article
Statistics & Probability
Yukun Liu, Pengfei Li, Yuejiao Fu
JOURNAL OF PROBABILITY AND STATISTICS
(2012)
Article
Oncology
Naomi A. Miller, Judith-Anne W. Chapman, Jin Qian, William A. Christens-Barry, Yuejiao Fu, Yan Yuan, H. Lavina A. Lickley, David E. Axelrod
CANCER INFORMATICS
(2010)