4.2 Article

Bayesian variable selection using an adaptive powered correlation prior

Journal

JOURNAL OF STATISTICAL PLANNING AND INFERENCE
Volume 139, Issue 8, Pages 2665-2674

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.jspi.2008.12.004

Keywords

Bayesian variable selection; Collinearity; Powered correlation prior; Zellner's g-prior

Funding

  1. NSF [DMS-0705968]

Abstract

The problem of selecting the correct subset of predictors within a linear model has received much attention in recent literature. Within the Bayesian framework, a popular choice of prior has been Zellner's g-prior, which is based on the inverse of the empirical covariance matrix of the predictors. This article proposes an extension of Zellner's prior that allows for a power parameter on the empirical covariance of the predictors. The power parameter helps control the degree to which correlated predictors are smoothed towards or away from one another. In addition, the empirical covariance of the predictors is used to obtain suitable priors over model space. In this manner, the power parameter also helps to determine whether models containing highly collinear predictors are preferred or avoided. The proposed power parameter can be chosen via an empirical Bayes method, which leads to a data-adaptive choice of prior. Simulation studies and a real data example are presented to show how the power parameter is well determined from the degree of cross-correlation among the predictors. The proposed modification compares favorably to the standard use of Zellner's prior and an intrinsic prior in these examples. (C) 2009 Elsevier B.V. All rights reserved.
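
The idea of placing a power on the empirical covariance can be illustrated with a minimal sketch. It assumes the prior covariance of the coefficients is proportional to (X'X)^lam, computed through the eigendecomposition of X'X, so that lam = -1 recovers the form used by Zellner's g-prior. The function name `powered_prior_covariance`, the scaling `g * sigma2`, and the simulated design matrix are illustrative assumptions, not the paper's exact parametrization.

```python
import numpy as np


def powered_prior_covariance(X, lam, g=1.0, sigma2=1.0):
    """Return g * sigma2 * (X'X)^lam using the spectral decomposition of X'X.

    The scaling g * sigma2 is an assumed parametrization for illustration only.
    """
    xtx = X.T @ X
    eigvals, eigvecs = np.linalg.eigh(xtx)  # X'X is symmetric
    if eigvals.min() <= 0:
        raise ValueError("X'X must be positive definite to take a general power")
    # (X'X)^lam = V diag(d^lam) V', scaling column j of V by d_j^lam
    return g * sigma2 * (eigvecs * eigvals**lam) @ eigvecs.T


# Hypothetical design: the first two predictors are highly collinear.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X = np.hstack([
    z + 0.1 * rng.normal(size=(200, 1)),
    z + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

cov_zellner = powered_prior_covariance(X, lam=-1.0)   # inverse of X'X
cov_ridge = powered_prior_covariance(X, lam=0.0)      # identity matrix
cov_positive = powered_prior_covariance(X, lam=1.0)   # X'X itself
```

In this sketch, a negative power gives the coefficients of the two collinear predictors a negative prior covariance, while a positive power gives them a positive one, which is one way to read the abstract's statement that correlated predictors are smoothed towards or away from one another.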

Recommended

Article Statistics & Probability

Bayesian Regression Using a Prior on the Model Fit: The R2-D2 Shrinkage Prior

Yan Dora Zhang, Brian P. Naughton, Howard D. Bondell, Brian J. Reich

Summary: This article proposes a new class of shrinkage priors for high-dimensional linear regression through specifying a prior on the model fit and distributing it to the coefficients in a novel way. The proposed method outperforms previous approaches in concentration and tail behavior, leading to improved posterior contraction and empirical performance.

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2022)

Article Environmental Sciences

Nonparametric conditional density estimation in a deep learning framework for short-term forecasting

David B. Huberman, Brian J. Reich, Howard D. Bondell

Summary: This paper introduces a conditional distribution estimation technique that combines machine learning algorithms to simultaneously estimate the entire conditional distribution and flexibly incorporate machine learning techniques, with the purpose of forecasting tropical cyclone intensity to provide additional insights and influence decision-making. Through simulation studies and real data validation, the effectiveness of the method is demonstrated, with further developments applicable to more complex forecasting and other applications.

ENVIRONMENTAL AND ECOLOGICAL STATISTICS (2022)

Article Statistics & Probability

Spatial Confounding in Generalized Estimating Equations

Francis K. C. Hui, Howard D. Bondell

Summary: Spatial confounding is a contentious research area in spatial statistics, primarily focused on spatial mixed models but also relevant in the context of generalized estimating equations (GEEs). To address spatial confounding, a restricted spatial working correlation matrix is proposed to estimate a partitioned covariate effect in GEEs.

AMERICAN STATISTICIAN (2022)

Article Mathematics, Applied

In search of peak human athletic potential: A mathematical investigation

Nick James, Max Menzies, Howard Bondell

Summary: This paper applies various methods to study the performance trends of elite athletes, revealing the Olympic effect, a leveling off of athlete scores, and similarities in performance trends between men's and women's categories, and analyzes the geographic composition of top athletes.

CHAOS (2022)

Article Statistics & Probability

On robust probabilistic principal component analysis using multivariate t-distributions

Yiping Guo, Howard Bondell

Summary: This paper explores the application of multivariate t-distributions in probabilistic principal component analysis (PPCA) and provides a reexamination of some errors in the existing literature. Additionally, a new Monte Carlo expectation-maximization (MCEM) algorithm is introduced to implement a general type of such models.

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS (2023)

Article Materials Science, Textiles

Process-Structure-Property relationship of roping in meltblown nonwovens

Erin Roberts, Sujit Ghosh, Behnam Pourdeyhimi

Summary: This study developed a novel method to measure roping in meltblown nonwovens and analyzed its impact on pore size uniformity, filtration efficiency, and barrier properties. The study found that the interactions of capillary density with air flow and air flow with die-to-collector distance had the greatest influence on roping formation.

JOURNAL OF THE TEXTILE INSTITUTE (2023)

Article Statistics & Probability

Nonstationary Gaussian Process Discriminant Analysis With Variable Selection for High-Dimensional Functional Data

Weichang Yu, Sara Wade, Howard D. Bondell, Lamiae Azizi

Summary: High-dimensional classification and feature selection tasks are common with the advancement of data acquisition technology. In fields such as biology, genomics, and proteomics, where data are often functional and exhibit roughness and nonstationarity, traditional methods face additional challenges. In this work, we propose a novel approach called Gaussian process discriminant analysis (GPDA) that combines variable selection and classification in a unified framework. By utilizing sparse inverse covariance matrices and variational methods, our approach achieves scalable inference and demonstrates good performance.

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS (2023)

Article Economics

Generalizing the General: generalizing the CES production function to allow for the viability of input thresholds

Ali Zeytoon-Nejad, Barry K. Goodwin, Sujit Ghosh

Summary: This paper proposes a generalized variant of the CES production function that allows for the inclusion of minimum required levels of inputs. Empirical applications are provided for irrigation and nitrogen using experimental datasets and datasets generated through Monte-Carlo experiments.

APPLIED ECONOMICS (2023)

Article Computer Science, Interdisciplinary Applications

Shape-constrained estimation in functional regression with Bernstein polynomials

Rahul Ghosal, Sujit Ghosh, Jacek Urbanek, Jennifer A. Schrack, Vadim Zipunnikov

Summary: This study proposes a new estimation method for shape-constrained functional regression models using Bernstein polynomials. Theoretical results demonstrate the consistency of the constrained estimators, and numerical analysis shows improved efficiency and accuracy of the estimators under shape constraints.

COMPUTATIONAL STATISTICS & DATA ANALYSIS (2023)

Article Ecology

Unintended environmental benefits of crop insurance: Nitrogen and phosphorus in water bodies

Xun Lu, Yuyuan Che, Roderick M. Rejesus, Barry K. Goodwin, Sujit K. Ghosh, Jayash Paudel

Summary: Agricultural policies can indirectly impact the natural environment through their influence on farmer input behavior. This study examines the specific effects of crop insurance participation on nitrogen and phosphorus concentrations in waterways. The results suggest that higher crop insurance participation is associated with lower nitrogen concentrations but does not have a consistent effect on phosphorus concentrations.

ECOLOGICAL ECONOMICS (2023)

Article Statistics & Probability

Nonparametric estimation of isotropic covariance function

Yiming Wang, Sujit K. Ghosh

Summary: This paper proposes a nonparametric model using Bernstein polynomials to approximate arbitrary isotropic covariance functions. The popular L-alpha and L-2 norms are used to investigate the approximation properties. A computationally efficient sieve maximum likelihood (sML) estimation method is developed to estimate the unknown isotropic covariance function. Numerical results show that the proposed approach outperforms both parametric and nonparametric methods in terms of reducing bias and having lower norms.

JOURNAL OF NONPARAMETRIC STATISTICS (2023)

Article Statistics & Probability

Bayesian Analysis of First-Order Markov Models for Autocorrelated Binary Responses

Dasom Lee, Sujit Ghosh

Summary: In many clinical trials, binary-valued patient outcomes measured asynchronously over time across different dose levels are common. To address autocorrelation among these longitudinally observed outcomes, a first-order Markov model for binary data is developed. Nonhomogeneous models for transition probabilities are proposed to account for asynchronously observed time points, with B-spline basis functions used for modeling the transition probabilities. The model also allows estimation of any underlying non-decreasing curve based on suitable prior distributions, along with the incorporation of individual-specific random effects through a mixed effect model. Numerical comparisons with traditional models are conducted using simulated data sets, as well as practical applications using real data sets.

JOURNAL OF STATISTICAL THEORY AND PRACTICE (2023)

Article Astronomy & Astrophysics

Beyond Two-dimensional Mass-Radius Relationships: A Nonparametric and Probabilistic Framework for Characterizing Planetary Samples in Higher Dimensions

Shubham Kanodia, Matthias Y. He, Eric B. Ford, Sujit K. Ghosh, Angie Wolfgang

Summary: This work extends the existing nonparametric and probabilistic framework to simultaneously model distributions beyond two dimensions. The potential of this multidimensional approach is showcased in several science cases relating to planetary mass, radius, insolation, stellar mass, and dust mass measurements. Bootstrap and Monte Carlo sampling are employed to quantify the impact of finite sample size and measurement uncertainties. The open-source MRExo Python package is updated to incorporate these changes and provides users with a flexible framework for various data sets.

ASTROPHYSICAL JOURNAL (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Uncertainty Quantification in Depth Estimation via Constrained Ordinal Regression

Dongting Hu, Liuhua Peng, Tingjin Chu, Xiaoxing Zhang, Yinian Mao, Howard Bondell, Mingming Gong

Summary: This paper presents an uncertainty quantification method for supervised MDE models, capturing uncertainty through predictive variance and estimating error variance and estimation variance using constrained ordinal regression and bootstrapping methods. Experimental results demonstrate the accuracy and effectiveness of the proposed method.

COMPUTER VISION - ECCV 2022, PT II (2022)

Article Mathematics, Interdisciplinary Applications

Model Validation of a Single Degree-of-Freedom Oscillator: A Case Study

Edward Boone, Jan Hannig, Ryad Ghanam, Sujit Ghosh, Fabrizio Ruggeri, Serge Prudhomme

Summary: This paper investigates the validation process of a single degree-of-freedom oscillator to assess its predictive capabilities. Model validation is the process of determining the accuracy of a model in predicting observed physical events or system features. Virtual data is generated from a non-linear oscillator, and a mathematical model is derived by neglecting the non-linear term. Bayesian updating is used to identify model parameters, including calibration of the normal probability density function representing model error.

STATS (2022)

Article Statistics & Probability

Estimation for the Cox model with biased sampling data via risk set sampling

Omidali Aghababaei Jazi

Summary: In this paper, a pseudo-partial likelihood estimation method is proposed to estimate parameters in the Cox proportional hazards model with right-censored and biased sampling data by adjusting sample risk sets. The asymptotic properties of the resulting estimator are studied, and a simulation study is conducted to illustrate the finite sample performance. The proposed method is also applied to analyze a set of HIV/AIDS data.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Robust penalized empirical likelihood in high dimensional longitudinal data analysis

Liya Fu, Shuwen Hu, Jiaqi Li

Summary: Empirical likelihood (EL) is an effective nonparametric method that combines estimating equations flexibly and adaptively. A penalized EL method based on robust estimating functions is proposed for variable selection in a high-dimensional model, allowing the dimensions to grow exponentially with the sample size. The proposed method improves robustness and effectiveness in the presence of outliers or heavy-tailed data. Extensive simulation studies and a real data example demonstrate the enhanced variable selection accuracy when dealing with heavy-tailed data or outliers.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Subgroup analysis for the functional linear model

Yifan Sun, Ziyi Liu, Wu Wang

Summary: This paper extends the classical functional linear regression model to allow for heterogeneous coefficient functions among different subgroups of subjects. A penalization-based approach is proposed to simultaneously determine the number and structure of subgroups and coefficient functions within each subgroup. The paper provides an effective computational algorithm and establishes the oracle properties and estimation consistency of the model. Extensive numerical simulations demonstrate its superiority compared to competing methods, and an analysis of an air quality dataset leads to interesting findings and improved predictions.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

A pair of novel priors for improving and extending the conditional MLE

Takemi Yanagimoto, Yoichi Miyata

Summary: A Bayesian estimator is proposed to improve the conditional maximum likelihood estimation by introducing a pair of priors. The conditional maximum likelihood estimation is explained using the posterior mode under a prior, and a promising estimator is defined using the posterior mean under a corresponding prior. The advantages of this approach include two different optimality properties of the induced estimator, the ease of various extensions, and the possible treatments for a finite sample size. The existing approaches are discussed and critiqued.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Jackknife empirical likelihood confidence intervals for the categorical Gini correlation

Sameera Hewage, Yongli Sang

Summary: This paper introduces a new method for measuring dependence, the categorical Gini correlation rho(g), and proposes a Jackknife empirical likelihood approach for constructing confidence intervals. Simulation studies and real data applications demonstrate competitive performance of the proposed method in terms of coverage accuracy and interval length.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

A multidimensional objective prior distribution from a scoring rule

Isadora Antoniano-Villalobos, Cristiano Villa, Stephen G. Walker

Summary: Constructing objective priors for multidimensional parameter spaces is challenging, and a common approach assumes independence and uses standard objective methods to obtain marginal distributions. In this paper, a novel objective prior is proposed by extending the objective method for one-dimensional case, allowing for a dependence structure in multidimensional parameter spaces.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Construction of optimal supersaturated designs by the expansive replacement method

Hui Li, Liuqing Yang, Kashinath Chatterjee, Min-Qian Liu

Summary: Supersaturated design (SSD) plays a crucial role in factor screening, and E(f(NOD)) criterion is one of the most widely used criteria for evaluating multi-level and mixed-level SSDs. This paper provides methods to construct multi-level E(f(NOD)) optimal SSDs with general run sizes, which can also be extended to mixed-level SSDs. The main idea of these methods is to combine two processed generalized Hadamard matrices with the expansive replacement method. These proposed methods are easy to implement, and the non-orthogonality between any two columns of the resulting SSDs is well controlled by that of the source designs.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

A comparison of likelihood-based methods for size-biased sampling

Victoria L. Leaver, Robert G. Clark, Pavel N. Krivitsky, Carole L. Birrell

Summary: This article compares three likelihood approaches to estimation under informative sampling and examines their efficiency and asymptotic variance. The study shows that sample likelihood estimation approaches the efficiency of full maximum likelihood estimation when the sample size tends to infinity and the sampling fraction tends to zero. However, when the sample size tends to infinity and the sampling fraction is not negligible, maximum likelihood estimation is more efficient due to considering the possibility of duplicate samples. Pseudo-likelihood estimation can perform poorly in certain cases. For a special case where the superpopulation is exponential and the selection is probability proportional to size, the anticipated variance of pseudo-likelihood estimation is infinite.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Maximum likelihood estimation of the log-concave component in a semi-parametric mixture with a standard normal density

Fadoua Balabdaoui, Harald Besdziek

Summary: The two-component mixture model with known background density, unknown signal density, and unknown mixing proportion has been studied in this paper. The log-concave MLE of the signal density is computed using the estimator of Patra & Sen (2016), and its consistency and convergence are shown. The performance of this method is evaluated through a simulation study.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Time changes and stationarity issues for extended scalar autoregressive models

V. Girardin, R. Senoussi

Summary: This paper investigates different issues related to stationarity reduction in autoregressive models, including both continuous and discrete time cases. Necessary and sufficient conditions for autoregressive models to be weakly stationary are explored, with explicit formulas for the time changes. Furthermore, the issue of stationarity reduction for discrete sequences sampled from continuous time autoregressive processes is also considered.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Regression models for circular data based on nonnegative trigonometric sums

Juan Jose Fernandez-Duran, Maria Mercedes Gregorio-Dominguez

Summary: This paper presents the application of nonnegative trigonometric sums (NNTS) models in circular data analysis. Regression models for circular-dependent variables are constructed by fitting great circles on the parameter hypersphere, enabling the identification of different regions along the circle. The transformation of the original circular variable into a linear variable allows for the application of common linear regression methods in circular data analysis.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

Robust inference for subgroup analysis with general transformation models

Miao Han, Yuanyuan Lin, Wenxin Liu, Zhanfeng Wang

Summary: The article proposes a method based on maximum rank correlation and concave fusion to automatically determine the number of subgroups, identify subgroup structure, and estimate subgroup-specific covariate effects. The method can be used without prior grouping information and is applicable to handling censored data.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)

Article Statistics & Probability

A framework of zero-inflated Bayesian negative binomial regression models for spatiotemporal data

Qing He, Hsin-Hsiung Huang

Summary: This article introduces a method for spatiotemporal data analysis with massive zeros, which is widely used in epidemiology and public health. The method fits zero-inflated negative binomial models using a Bayesian framework and employs latent variables from Polya-Gamma distributions to improve computational efficiency.

JOURNAL OF STATISTICAL PLANNING AND INFERENCE (2024)