Article
Biochemical Research Methods
Ruilin Li, Christopher Chang, Yosuke Tanigawa, Balasubramanian Narasimhan, Trevor Hastie, Robert Tibshirani, Manuel A. Rivas
Summary: This paper addresses the computational challenges posed by large-scale and high-dimensional genome sequencing data, and develops two efficient solvers for optimization problems in this context. By utilizing a two-bit representation for genetic matrices, the memory requirement is reduced and computational speed is improved. The proposed methods successfully solve Lasso, group Lasso, linear, logistic, and Cox regression problems on sparse genetic matrices within 10 minutes using less than 32GB of memory.
Article
Anthropology
Diana Karimova, Roger Th. A. J. Leenders, Marlyne Meijerink-Bosman, Joris Mulder
Summary: In recent years, relational event models have gained increasing interest for dynamic social network analysis. These models are based on the concept of an event, which is defined as a triplet of time, sender, and receiver of a social interaction. The goal of relational event models is to understand the factors driving the pattern of social interactions among actors. Researchers often include a large number of predictors in their studies, but this can lead to overfitting and complex models that are difficult to interpret. Bayesian regularization methods offer a potential solution by identifying the relevant effects and reducing the number of significant effects in the model. This paper proposes Bayesian regularization methods for relational event models and applies them to three empirical applications, showing that these methods can provide a more parsimonious description of social interaction behavior without sacrificing predictive performance.
Article
Immunology
Ferris A. Ramadan, Katherine D. Ellingson, Robert A. Canales, Edward J. Bedrick, John N. Galgiani, Fariba M. Donovan
Summary: Demographic and clinical indicators have been described to support identification of coccidioidomycosis, but their interplay has not been explored in a clinical setting. This study aimed to develop a predictive model for coccidioidomycosis among participants with suspected infection.
EMERGING INFECTIOUS DISEASES
(2022)
Article
Engineering, Mechanical
M. Aucejo, O. De Smet
Summary: Input estimation remains a significant issue in structural dynamics, with two main groups of inverse methods, operating in the time and frequency domains. This paper introduces a generalized multiplicative regularization for estimating mechanical loads on linear structures, demonstrating high solution accuracy through numerical and real-world applications. The extra tuning parameter in this approach plays a key role in improving results in the presence of measurement noise.
MECHANICAL SYSTEMS AND SIGNAL PROCESSING
(2021)
Article
Political Science
Jong Hee Park, Soichiro Yamauchi
Summary: This paper develops a general Bayesian method for change-point detection in high-dimensional data and applies it in the fixed-effect model. The proposed method can jointly estimate high-dimensional parameters and hidden change-points, successfully identifying temporal heterogeneity in regression model parameters.
POLITICAL ANALYSIS
(2023)
Article
Mathematics, Interdisciplinary Applications
Muhammad Shoaib, Aqsa Zafar Abbasi, Muhammad Asif Zahoor Raja, Kottakkaran Sooppy Nisar
Summary: Studying the dynamics of outbreak transmission from prey to predator is crucial. This paper presents a novel implementation of intelligent computation using supervised machine learning to analyze a fractional epidemiological model. The effectiveness of the model is demonstrated through various evaluations.
CHAOS SOLITONS & FRACTALS
(2022)
Article
Biochemical Research Methods
Gianna Serafina Monti, Peter Filzmoser
Summary: High-throughput sequencing technologies provide a large amount of data for microbiome composition analysis, which requires consideration of data sparsity and uniqueness. This article proposes a regression variable selection method that takes into account the special nature of microbiome data, achieving sparsity and robustness in regression coefficient estimates through elastic-net regularization. The practical utility of the method is demonstrated through real-world application and simulation studies.
Article
Computer Science, Information Systems
Prabhishek Singh, Raj Shree, Manoj Diwakar
Summary: This paper introduces a new technique for despeckling SAR images using DWT and method noise thresholding, focusing on selecting decomposition levels based on entropy analysis and fusing high-frequency coefficients to improve image quality.
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
(2021)
Article
Environmental Sciences
Gregory L. Britten, Yara Mohajerani, Louis Primeau, Murat Aydin, Catherine Garcia, Wei-Lei Wang, Benoit Pasquier, B. B. Cael, Francois W. Primeau
Summary: Hierarchical Bayesian modeling is increasingly used in environmental science to describe statistical complexities in large compiled datasets, offering benefits such as flexibility, reduction of uncertainty, and incorporation of prior scientific information. Its versatility and feasibility for diverse environmental applications are highlighted, enhanced by recent developments in Markov Chain Monte Carlo algorithms and user-friendly software implementations.
FRONTIERS IN ENVIRONMENTAL SCIENCE
(2021)
Article
Statistics & Probability
Javier Enrique Aguilar, Paul-Christian Buerkner
Summary: Training high-dimensional regression models on sparse data with more parameters than observations is complex. Bayesian inference can be achieved using shrinkage priors, but existing ones do not handle multilevel structures. We propose the R2D2M2 prior, an extension of the R2D2 prior, for linear multilevel models. The proposed prior enables both local and global shrinkage, has interpretable hyperparameters, and allows evaluation and interpretation of shrinkage. Extensive experiments show its effectiveness for estimating complex Bayesian multilevel models.
ELECTRONIC JOURNAL OF STATISTICS
(2023)
Article
Computer Science, Interdisciplinary Applications
Nicola Donelli, Stefano Peluso, Antonietta Mira
Summary: Interactions among multiple time series of positive random variables play a crucial role in various financial applications, with the popular model being the vector Multiplicative Error Model (vMEM) that imposes a linear iterative structure on the dynamics of the conditional mean. A Bayesian semiparametric approach is used to address the restrictive assumption on the distribution of the random innovation term in vMEM, resulting in a more flexible specification. The method avoids computational complications by formulating a slice sampler on the parameter-extended unconstrained version of the model, and outperforms classical methods in terms of fitting and predictive power.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2021)
Article
Engineering, Multidisciplinary
Peiliang Xu
Summary: The algebraic fitting of circles was originally proposed by Delogne (1972) and Kasa (1976). In this study, we extend their work and introduce fast and nearly unbiased weighted least squares methods to best fit circles. Simulation results demonstrate that the two non-iterative bias-corrected weighted LS methods outperform the naive weighted LS method, ordinary LS-based methods, and gradient-based weighted LS method in terms of both biases and mean squared errors, regardless of strong or weak geometric constraints. However, a weak geometric constraint leads to poor circle fitting. To address this, we propose regularized variants of the bias-corrected weighted LS method to fit circles with weak geometric constraints. Simulations also reveal that these two non-iterative regularized variants achieve satisfactory circle fitting and consistently perform the best among all regularization methods for ill-conditioned circle fitting problems.
Article
Mathematics, Interdisciplinary Applications
Daniel R. Kowal
Summary: The study introduces a fully Bayesian framework for dynamic functional regression, which models the time-evolution of functional data using scalar predictors and captures within-curve dependence using unknown basis functions for dimension reduction and scalability. The methodology utilizes shrinkage priors to guard against overfitting and incorporates time-varying parameter regression to address the dynamics of the functional data. Posterior inference is made possible using a customized Gibbs sampler, showing exceptional forecasting accuracy and uncertainty quantification over four decades.
Article
Mathematics
Cassio Roberto de Andrade Alves, Marcio Laurini
Summary: This paper introduces a Bayesian shrinkage approach for estimating the CAPM with a large number of instruments, accounting for measurement errors. The approach effectively counters bias escalation in the conventional two-stage least squares estimation. Empirical results show that the estimated CAPM beta values generated by our approach differ significantly from ordinary least squares and two-stage least squares approaches, with notable economic implications. Additionally, our approach substantially enhances the explanatory power of the CAPM framework when applied to average cross-sectional asset returns.
Article
Automation & Control Systems
Simone Formentin, Alessandro Chiuso
Summary: This paper presents a novel theoretical framework for control-oriented identification based on a Bayesian modeling perspective, emphasizing the incorporation of closed-loop specifications through suitable regularization. Additionally, a Bayesian robust control design approach is discussed, utilizing all information from the modeling procedure and demonstrating its effectiveness against state-of-the-art regularized identification in digital control system design.
Article
Computer Science, Interdisciplinary Applications
Blair Robertson, Chris Price
Summary: Spatial sampling designs are crucial for accurate estimation of population parameters. This study proposes a new design method that generates samples with good spatial spread and performs favorably compared to existing designs.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Hiroya Yamazoe, Kanta Naito
Summary: This paper focuses on the simultaneous confidence region of a one-dimensional curve embedded in multi-dimensional space. An estimator of the curve is obtained through local linear regression on each variable in multi-dimensional data. A method to construct a simultaneous confidence region based on this estimator is proposed, and theoretical results for the estimator and the region are developed. The effectiveness of the region is demonstrated through simulation studies and applications to artificial and real datasets.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Cheng Peng, Drew P. Kouri, Stan Uryasev
Summary: This paper introduces a novel optimal experimental design method for quantifying the distribution tails of uncertain system responses. The method minimizes the variance or conditional value-at-risk of the upper bound of the predicted quantile, and estimates the data uncertainty using quantile regression. The optimal design problems are solved as linear programming problems, making the proposed methods efficient even for large datasets.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xiaofei Wu, Hao Ming, Zhimin Zhang, Zhenyu Cui
Summary: This paper proposes a model that combines quantile regression with a fused LASSO penalty, and introduces an iterative ADMM-based algorithm to fit it on high-dimensional datasets. The paper proves the global convergence and comparable convergence rates of the algorithm, and analyzes the theoretical properties of the model. Numerical experimental results support the superior performance of the model.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xin He, Xiaojun Mao, Zhonglei Wang
Summary: This paper proposes a nonparametric imputation method with sparsity to estimate the finite population mean, using an efficient kernel method and sparse learning for estimation. An augmented inverse probability weighting framework is adopted to achieve a central limit theorem for the proposed estimator under regularity conditions.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Christian H. Weiss, Fukang Zhu
Summary: This study introduces multiplicative error models for discrete-valued count time series (CMEMs), which are closely related to the integer-valued generalized autoregressive conditional heteroscedasticity (INGARCH) models. It derives the stochastic properties and estimation approaches of different types of INGARCH-CMEMs, and demonstrates their performance and application through simulations and real-world data examples.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Ming-Hung Kao, Ping-Han Huang
Summary: Optimal designs for sparse functional data under the functional empirical component (FEC) settings are investigated. New computational methods and theoretical results are developed to efficiently obtain optimal exact and approximate designs. A hybrid exact-approximate design approach is proposed and demonstrated to be efficient through simulation studies and a real example.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Mateus Maia, Keefe Murphy, Andrew C. Parnell
Summary: The Bayesian additive regression trees (BART) model is a powerful ensemble method for regression tasks, but its lack of smoothness and explicit covariance structure can limit its performance. The Gaussian processes Bayesian additive regression trees (GP-BART) model addresses this limitation by incorporating Gaussian process priors, resulting in superior performance in various scenarios.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xichen Mou, Dewei Wang
Summary: Human biomonitoring is a method of monitoring human health by measuring the accumulation of harmful chemicals in the body. To reduce the high cost of chemical analysis, researchers have adopted a cost-effective approach that combines specimens and analyzes the concentration of toxic substances in the pooled samples. To effectively interpret these aggregated measurements, a new regression framework is proposed by extending the additive partially linear model (APLM). The APLM is versatile in capturing the complex association between outcomes and covariates, making it valuable in assessing the complex interplay between chemical bioaccumulation and potential risk factors.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Lili Yu, Yichuan Zhao
Summary: The classical accelerated failure time model is a linear model commonly used for right censored survival data, but it cannot handle heteroscedastic survival data. This paper proposes a Laplace approximated quasi-likelihood method with a continuous estimating equation to address this issue, and provides estimation bias and confidence interval estimation formulas.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Shaobo Jin, Youngjo Lee
Summary: Hierarchical generalized linear models are widely used for fitting random effects models, but the standard error estimators receive less attention. Current standard error estimation methods are not necessarily accurate, and a sandwich estimator is proposed to improve the accuracy of standard error estimation.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Rebeca Pelaez, Ingrid Van Keilegom, Ricardo Cao, Juan M. Vilar
Summary: This article proposes an estimator for the probability of default (PD) in credit risk, derived from a nonparametric conditional survival function estimator based on cure models. The asymptotic expressions for bias, variance, and normality of the estimator are presented. Through simulation and empirical studies, the performance and practical behavior of the nonparametric estimator are compared with other methods.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
L. M. Andre, J. L. Wadsworth, A. O'Hagan
Summary: This paper proposes a dependence model that captures the entire data range in multivariate cases. By blending two copulas with different characteristics and using a dynamic weighting function for smooth transition, the model is able to flexibly capture various dependence structures.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Niwen Zhou, Xu Guo, Lixing Zhu
Summary: The paper investigates hypothesis testing regarding the potential additional contributions of other covariates to the structural function, given the known covariates. The proposed distance-based test, based on Neyman's orthogonality condition, effectively detects local alternatives and is robust to the influence of nuisance functions. Numerical studies and real data analysis demonstrate the importance of this test in exploring covariates associated with AIDS treatment effects.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Blake Moya, Stephen G. Walker
Summary: A full posterior analysis method for nonparametric mixture models using Gibbs-type prior distributions, including the well-known Dirichlet process mixture (DPM) model, is presented. The method removes the random mixing distribution and enables a simple-to-implement Markov chain Monte Carlo (MCMC) algorithm. The removal procedure reduces some of the posterior uncertainty and introduces a novel replacement approach. The method only requires the probabilities of a new or an old value associated with the corresponding Gibbs-type exchangeable sequence, without the need for explicit representations of the prior or posterior distributions. This allows the implementation of mixture models with full posterior uncertainty, including one introduced by Gnedin. The paper also provides numerous illustrations and introduces an R package called CopRe that implements the methodology.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)