Article
Mathematics
A. Pekgor
Summary: Recently, new goodness-of-fit tests based on Kullback-Leibler divergence and likelihood ratio have been introduced for the Cauchy distribution, claiming to be more powerful than traditional tests. This study proposes a novel test for the Cauchy distribution and derives its asymptotic null distribution. Critical values are determined through Monte Carlo simulation for various sample sizes, and power analysis reveals the superiority of the proposed test under certain conditions.
JOURNAL OF MATHEMATICS
(2023)
Article
Statistics & Probability
Adel Javanmard, Mohammad Mehrabi
Summary: This paper discusses the fundamental problem of assessing the goodness-of-fit for a general binary classifier and proposes a novel test method called GRASP. The method is applicable in finite sample settings and is not restricted by the distribution of features. Additionally, an improved method called model-X GRASP is proposed for situations where the joint distribution of the features vector is known, which can achieve better power.
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
(2023)
Article
Biochemical Research Methods
Mengqi Zhang, Sahar Gelfman, Cristiane Araujo Martins Moreno, Janice M. McCarthy, Matthew B. Harms, David B. Goldstein, Andrew S. Allen
Summary: Gene set-based signal detection analysis is used to detect the association between a trait and a set of genes by accumulating signals across the genes in the set. This study presents a flexible framework based on tail-focused GOF statistics, which depends on two critical parameters. Guidance on statistic selection is provided and the methods are implemented in the user-friendly R package wHC. The methods are applied to a study on amyotrophic lateral sclerosis.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Computer Science, Information Systems
Alexander Shapiro, Yao Xie, Rui Zhang
Summary: The research develops a general theory of goodness-of-fit testing for non-linear models, where the residual of the model fit follows a chi-squared distribution related to the model order and problem dimension. A sequential method for selecting model orders is presented, demonstrating broad applications in machine learning and signal processing.
IEEE TRANSACTIONS ON INFORMATION THEORY
(2021)
Article
Computer Science, Interdisciplinary Applications
Alberto Fernandez-de-Marcos, Eduardo Garcia-Portugues
Summary: Obtaining exact null distributions for goodness-of-fit test statistics is difficult, so practitioners often rely on asymptotic null distributions or Monte Carlo methods. This study presents improved methods for stabilizing the exact critical values and obtaining exact p-values for various classic and novel test statistics used for goodness-of-fit testing. These methods have been applied and shown to be effective in analyzing small-to-moderate sequentially-measured samples in astronomy.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2023)
Article
Multidisciplinary Sciences
Subhra Sankar Dhar, Shalabh
Summary: This article investigates the impact of the BCG vaccine on COVID-19 mortality and proposes a measure of goodness of fit to understand the relationship between LTBI and COVID-19 mortality.
SCIENTIFIC REPORTS
(2022)
Article
Computer Science, Interdisciplinary Applications
Dimitrios Bagkavos, Prakash N. Patil
Summary: A novel goodness-of-fit test is introduced to assess the validity of maximum likelihood estimates of normal mixture densities with a known number of components. The authors provide theoretical quantification of the test statistic's size and power functions and derive a closed-form bandwidth rule and a cut-off point suitable for finite sample implementations. An extensive simulation study and analysis of real-world datasets demonstrate the superiority of this new test in all considered examples.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2023)
Article
Computer Science, Interdisciplinary Applications
Chihiro Watanabe, Taiji Suzuki
Summary: This study developed a new goodness-of-fit test for latent block models to test whether an observed data matrix fits a given set of row and column cluster numbers, or whether it consists of more clusters in at least one of the row and column directions.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2021)
Article
Engineering, Environmental
Patricia de Souza Medeiros Pina Ximenes, Antonio Samuel Alves da Silva, Fahim Ashkar, Tatijana Stosic
Summary: Analysis of precipitation data is crucial for strategic planning and drought preparedness. The study found that the GAM and WEI models provided the best overall fit, while the LNORM and GP models showed the best fit in certain months and regions with different average precipitation levels.
WATER SCIENCE AND TECHNOLOGY
(2021)
Article
Statistics & Probability
Sara Algeri, Xiangyu Zhang
Summary: The classical tests of goodness of fit aim to validate a postulated model's conformity to data, but they may not explore deviations from the truth or ways to improve rejected models. This work focuses on establishing a comprehensive framework that integrates modeling, estimation, inference, and graphics to better fit data. Smooth tests, a smoothed bootstrap, and a graphical tool called CD-plot are utilized for modeling, estimation, inference, and post-selection adjustments.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2022)
Article
Engineering, Multidisciplinary
Akhtar Ali, Majid Hussain, Abdul Ghaffar, Zafar Ali, Kottakkaran Sooppy Nisar, M. R. Alharthi, Wasim Jamshed
Summary: Tumor growth models are essential tools in cancer therapy, and mathematical modeling is critical for describing tumor growth dynamics. This study focuses on a mathematical model involving partial differential equations to analyze the dynamics of tumor growth. Simulation results suggest that the model is an authentic tool for understanding tumor behavior.
ALEXANDRIA ENGINEERING JOURNAL
(2021)
Article
Psychology, Multidisciplinary
Changmin Yoo
Summary: Different cohorts of Korean adolescents showed varying trajectories of smartphone dependency, with recent cohorts exhibiting higher dependency at an earlier age compared to past cohorts.
CURRENT PSYCHOLOGY
(2023)
Article
Statistics & Probability
Jiawei Zhang, Jie Ding, Yuhong Yang
Summary: The article proposes a methodology called BAGofT for assessing the goodness of fit of a general classification procedure. The method splits the data into a training set and a validation set, and identifies the most severe regions of underfitting by adaptively grouping the training set. A test statistic is then calculated based on this grouping and a comparison between the estimated success probabilities and the actual observed responses from the validation set. The BAGofT has a broader scope than existing methods in testing parametric classification models.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Review
Psychology, Multidisciplinary
Andreas Gegenfurtner
Summary: This meta-analytic review aimed to estimate the differences in model fit between bifactor exploratory structural equation modeling (B-ESEM) and other models. By analyzing 158 studies, it was found that B-ESEM model fit was superior to reference models. The results also indicated that model fit is sensitive to sample size, item number, and the number of specific and general factors in a model.
FRONTIERS IN PSYCHOLOGY
(2022)
Article
Automation & Control Systems
Krishnakumar Balasubramanian, Tong Li, Ming Yuan
Summary: The study examines the statistical performance of reproducing kernel Hilbert space (RKHS) embedding in testing problems, showing that a basic version of kernel embedding test may not be optimal, especially when chi-squared distance is used as the separation metric. The authors propose a simple modification to address this issue, demonstrating that the moderated approach offers optimal tests for various deviations from null hypotheses and can adapt over multiple interpolation spaces. Numerical experiments support the effectiveness of the approach.
JOURNAL OF MACHINE LEARNING RESEARCH
(2021)
Article
Computer Science, Interdisciplinary Applications
Blair Robertson, Chris Price
Summary: Spatial sampling designs are crucial for accurate estimation of population parameters. This study proposes a new design method that generates samples with good spatial spread and performs favorably compared to existing designs.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Hiroya Yamazoe, Kanta Naito
Summary: This paper focuses on the simultaneous confidence region of a one-dimensional curve embedded in multi-dimensional space. An estimator of the curve is obtained through local linear regression on each variable in multi-dimensional data. A method to construct a simultaneous confidence region based on this estimator is proposed, and theoretical results for the estimator and the region are developed. The effectiveness of the region is demonstrated through simulation studies and applications to artificial and real datasets.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Cheng Peng, Drew P. Kouri, Stan Uryasev
Summary: This paper introduces a novel optimal experimental design method for quantifying the distribution tails of uncertain system responses. The method minimizes the variance or conditional value-at-risk of the upper bound of the predicted quantile, and estimates the data uncertainty using quantile regression. The optimal design problems are solved as linear programming problems, making the proposed methods efficient even for large datasets.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xiaofei Wu, Hao Ming, Zhimin Zhang, Zhenyu Cui
Summary: This paper proposes a model that combines quantile regression with a fused LASSO penalty, and introduces an iterative algorithm based on ADMM to fit the model on high-dimensional datasets. The paper proves the global convergence and comparable convergence rates of the algorithm, and analyzes the theoretical properties of the model. Numerical experimental results support the superior performance of the model.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xin He, Xiaojun Mao, Zhonglei Wang
Summary: This paper proposes a nonparametric imputation method with sparsity to estimate the finite population mean, using an efficient kernel method and sparse learning for estimation. An augmented inverse probability weighting framework is adopted to achieve a central limit theorem for the proposed estimator under regularity conditions.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Christian H. Weiss, Fukang Zhu
Summary: This study introduces multiplicative error models for discrete-valued count time series (CMEMs), which are closely related to the integer-valued generalized autoregressive conditional heteroscedasticity (INGARCH) models. It derives the stochastic properties and estimation approaches of different types of INGARCH-CMEMs, and demonstrates their performance and application through simulations and real-world data examples.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Ming-Hung Kao, Ping-Han Huang
Summary: Optimal designs for sparse functional data under the functional empirical component (FEC) settings are investigated. New computational methods and theoretical results are developed to efficiently obtain optimal exact and approximate designs. A hybrid exact-approximate design approach is proposed and demonstrated to be efficient through simulation studies and a real example.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Mateus Maia, Keefe Murphy, Andrew C. Parnell
Summary: The Bayesian additive regression trees (BART) model is a powerful ensemble method for regression tasks, but its lack of smoothness and explicit covariance structure can limit its performance. The Gaussian processes Bayesian additive regression trees (GP-BART) model addresses this limitation by incorporating Gaussian process priors, resulting in superior performance in various scenarios.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xichen Mou, Dewei Wang
Summary: Human biomonitoring is a method of monitoring human health by measuring the accumulation of harmful chemicals in the body. To reduce the high cost of chemical analysis, researchers have adopted a cost-effective approach that combines specimens and analyzes the concentration of toxic substances in the pooled samples. To effectively interpret these aggregated measurements, a new regression framework is proposed by extending the additive partially linear model (APLM). The APLM is versatile in capturing the complex association between outcomes and covariates, making it valuable in assessing the complex interplay between chemical bioaccumulation and potential risk factors.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Lili Yu, Yichuan Zhao
Summary: The classical accelerated failure time model is a linear model commonly used for right censored survival data, but it cannot handle heteroscedastic survival data. This paper proposes a Laplace approximated quasi-likelihood method with a continuous estimating equation to address this issue, and provides estimation bias and confidence interval estimation formulas.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Shaobo Jin, Youngjo Lee
Summary: Hierarchical generalized linear models are widely used for fitting random effects models, but the standard error estimators receive less attention. Current standard error estimation methods are not necessarily accurate, and a sandwich estimator is proposed to improve the accuracy of standard error estimation.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Rebeca Pelaez, Ingrid Van Keilegom, Ricardo Cao, Juan M. Vilar
Summary: This article proposes an estimator for the probability of default (PD) in credit risk, derived from a nonparametric conditional survival function estimator based on cure models. The asymptotic expressions for bias, variance, and normality of the estimator are presented. Through simulation and empirical studies, the performance and practical behavior of the nonparametric estimator are compared with other methods.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
L. M. Andre, J. L. Wadsworth, A. O'Hagan
Summary: This paper proposes a dependence model that captures the entire data range in multi-variable cases. By blending two copulas with different characteristics and using a dynamic weighting function for smooth transition, the model is able to flexibly capture various dependence structures.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Niwen Zhou, Xu Guo, Lixing Zhu
Summary: The paper investigates hypothesis testing regarding the potential additional contributions of other covariates to the structural function, given the known covariates. The proposed distance-based test, based on Neyman's orthogonality condition, effectively detects local alternatives and is robust to the influence of nuisance functions. Numerical studies and real data analysis demonstrate the importance of this test in exploring covariates associated with AIDS treatment effects.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Blake Moya, Stephen G. Walker
Summary: A full posterior analysis method for nonparametric mixture models using Gibbs-type prior distributions, including the well-known Dirichlet process mixture (DPM) model, is presented. The method removes the random mixing distribution and enables a simple-to-implement Markov chain Monte Carlo (MCMC) algorithm. The removal procedure reduces some of the posterior uncertainty and introduces a novel replacement approach. The method only requires the probabilities of a new or an old value associated with the corresponding Gibbs-type exchangeable sequence, without the need for explicit representations of the prior or posterior distributions. This allows the implementation of mixture models with full posterior uncertainty, including one introduced by Gnedin. The paper also provides numerous illustrations and introduces an R package called CopRe that implements the methodology.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)