Article

Composite Likelihood Bayesian Information Criteria for Model Selection in High-Dimensional Data

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2010)

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION

Volume 105, Issue 492, Pages 1531-1540

Publisher

TAYLOR & FRANCIS INC

DOI: 10.1198/jasa.2010.tm09414

Keywords

Consistency; Model selection; Pseudo-likelihood; Variable selection

Category

Statistics & Probability

Funding

Natural Sciences and Engineering Research Council of Canada
NSF

Abstract

For high-dimensional data sets with complicated dependency structures, the full likelihood approach often leads to intractable computational complexity. This imposes difficulty on model selection, given that most traditionally used information criteria require evaluation of the full likelihood. We propose a composite likelihood version of the Bayes information criterion (BIC) and establish its consistency property for the selection of the true underlying marginal model. Our proposed BIC is shown to be selection-consistent under some mild regularity conditions, where the number of potential model parameters is allowed to increase to infinity at a certain rate of the sample size. Simulation studies demonstrate the empirical performance of this new BIC, especially for the scenario where the number of parameters increases with sample size. Technical proofs of our theoretical results are provided in the online supplemental materials.
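The abstract does not state the criterion's formula. As a rough illustration only, the sketch below assumes the form commonly quoted in the composite likelihood literature, CL-BIC = -2 cl(θ̂) + d* log n with effective dimension d* = tr(H⁻¹J), where H is the sensitivity matrix and J the variability matrix of the composite score. The working model (an independence composite likelihood for clustered Gaussian responses with known unit marginal variance), the function name `cl_bic`, and the simulated data are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def cl_bic(X, y):
    """Independence composite-likelihood BIC for a marginal linear model.

    X: (n, m, p) covariates, y: (n, m) responses, for n clusters of size m.
    Assumes (for illustration) Gaussian margins with known unit variance.
    Returns (criterion value, effective number of parameters d*).
    """
    n, m, p = X.shape
    Xf, yf = X.reshape(n * m, p), y.reshape(n * m)
    # Maximum composite likelihood estimate: pooled least squares.
    beta, *_ = np.linalg.lstsq(Xf, yf, rcond=None)
    resid = y - X @ beta                               # shape (n, m)
    # Composite log-likelihood: sum of marginal N(x'beta, 1) log-densities.
    cl = -0.5 * np.sum(resid**2) - 0.5 * n * m * np.log(2 * np.pi)
    # Per-cluster composite scores, s_i = X_i' (y_i - X_i beta).
    scores = np.einsum("imp,im->ip", X, resid)         # shape (n, p)
    H = Xf.T @ Xf / n                                  # sensitivity matrix
    J = scores.T @ scores / n                          # variability matrix
    d_star = np.trace(np.linalg.solve(H, J))           # tr(H^{-1} J)
    return -2.0 * cl + d_star * np.log(n), d_star

# Simulated clustered data with exchangeable within-cluster correlation.
rng = np.random.default_rng(0)
n, m, p, rho = 200, 4, 5, 0.5
X = rng.normal(size=(n, m, p))
beta_true = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
R = (1 - rho) * np.eye(m) + rho * np.ones((m, m))      # unit marginal variance
errors = rng.normal(size=(n, m)) @ np.linalg.cholesky(R).T
y = X @ beta_true + errors

# Compare nested candidate models by their criterion value (smaller is better).
candidates = [(0,), (0, 1), (0, 1, 2), (0, 1, 2, 3), (0, 1, 2, 3, 4)]
results = {cols: cl_bic(X[:, :, list(cols)], y) for cols in candidates}
best = min(results, key=lambda cols: results[cols][0])
print("selected columns:", best)
```

Because the within-cluster dependence is ignored in the working likelihood, d* generally differs from the raw parameter count; that difference is what separates this penalty from the ordinary BIC penalty.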

Recommended

Article Statistics & Probability

Univariate measurement error selection likelihood for variable selection of additive model

Xiaoyu Ma, Lu Lin

Summary: The paper introduces a measurement error selection likelihood to simultaneously select important variables and estimate additive components in a high-dimensional additive model. Although the proposed estimation involves multiple variables, it takes a univariate nonparametric form. The study establishes selection consistency and demonstrates finite-sample performance through Monte Carlo experiments.

STATISTICS (2021)

Article Statistics & Probability

Fast Network Community Detection With Profile-Pseudo Likelihood Methods

Jiangzhou Wang, Jingfei Zhang, Binghui Liu, Ji Zhu, Jianhua Guo

Summary: The stochastic block model is a widely studied network model for community detection, but fitting its likelihood function on large-scale networks is challenging. In this article, the authors propose a novel likelihood-based approach that decouples row and column labels, enabling fast alternating maximization. The method is computationally efficient, has a provable convergence guarantee, and is shown to provide consistent estimates of communities in a stochastic block model.

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2023)

Article Mathematics

Variable Selection for Spatial Logistic Autoregressive Models

Jiaxuan Liang, Yi Cheng, Yuqi Su, Shuyue Xiao, Yunquan Song

Summary: When the spatial response variables are discrete, the spatial logistic autoregressive model improves classification accuracy by adding an additional network structure to the ordinary logistic regression model. Sparse spatial logistic regression models have attracted significant attention due to the emergence of high-dimensional data in various fields. In this paper, a variable selection method is proposed for the high-dimensional spatial logistic autoregressive model. The penalized likelihood function is efficiently solved using an algorithm to identify important variables and make predictions. Simulations and a real example demonstrate the good performance of the proposed methods in a limited sample size.

MATHEMATICS (2022)

Article Statistics & Probability

STRONG SELECTION CONSISTENCY OF BAYESIAN VECTOR AUTOREGRESSIVE MODELS BASED ON A PSEUDO-LIKELIHOOD APPROACH

Satyajit Ghosh, Kshitij Khare, George Michailidis

Summary: This study introduces a pseudo-likelihood based Bayesian approach for consistent variable selection in high-dimensional VAR models, and demonstrates strong selection consistency of the proposed method. The research has significant implications for understanding the behavior of VAR models in high-dimensional settings.

ANNALS OF STATISTICS (2021)

Article Statistics & Probability

A ROBUST CONSISTENT INFORMATION CRITERION FOR MODEL SELECTION BASED ON EMPIRICAL LIKELIHOOD

Chixiang Chen, Ming Wang, Rongling Wu, Runze Li

Summary: Conventional likelihood-based criteria for model selection are limited when it comes to specifying distribution for complex data. A data-driven model-selection criterion based on empirical likelihood function is proposed, with robust plug-in estimators allowing for versatile applications and outperforming traditional criteria. The consistent model-selection property under a general context is established, with extensive simulation studies confirming its superiority.

STATISTICA SINICA (2022)

Article Statistics & Probability

GEE-Assisted Variable Selection for Latent Variable Models with Multivariate Binary Data

Francis K. C. Hui, Samuel Muller, A. H. Welsh

Summary: In this article, the connections between the parameters from conditional and marginal models for multivariate binary responses are studied. GEE-assisted variable selection is proposed for fast variable selection in GLLVMs, and simulation studies demonstrate its strong finite sample performance and computational efficiency.

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2023)

Article Physics, Multidisciplinary

Variable Selection of Spatial Logistic Autoregressive Model with Linear Constraints

Yunquan Song, Yuqi Su, Zhijian Wang

Summary: The paper proposes a variable selection method with linear constraints for high-dimensional spatial logistic autoregressive models in order to integrate prior information. Monte Carlo experiments show that the method performs well under finite samples.

ENTROPY (2022)

Article Physics, Fluids & Plasmas

Multivariate binary probability distribution in the Grassmann formalism

Takashi Arai

Summary: The proposed probability distribution for multivariate binary random variables is represented by principal minors of the parameter matrix, similar to the inverse covariance matrix in the multivariate Gaussian distribution. The model allows for analytical expressions of the partition function, central moments, and marginal and conditional distributions. Additionally, the proposed distribution can be obtained using Grassmann numbers, with the inverse matrix representing partial correlation.

PHYSICAL REVIEW E (2021)

Article Statistics & Probability

Variable Selection for High-Dimensional Nodal Attributes in Social Networks with Degree Heterogeneity

Jia Wang, Xizhen Cai, Xiaoyue Niu, Runze Li

Summary: This article introduces a class of network models where the likelihood of connection is influenced by high-dimensional nodal covariates and node-specific popularity. A Bayesian method is proposed for feature selection, with implementation via Gibbs sampling. To address computational challenges in large sparse networks, a working model is developed for parameter updates based on dense sub-graphs. Model selection consistency is proven for both models, even when dimension grows exponentially. Monte Carlo studies and real world examples illustrate the performance of the proposed models and estimation procedures. Supplementary materials are available online.

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION (2023)

Article Mathematics

Variable Selection for Length-Biased and Interval-Censored Failure Time Data

Fan Feng, Guanghui Cheng, Jianguo Sun

Summary: In this paper, a penalized variable selection technique based on Cox's proportional hazards model is developed for length-biased data with interval censoring. The proposed method is applied to real data and outperforms traditional variable selection methods based on conditional likelihood.

MATHEMATICS (2023)

Article Mathematical & Computational Biology

Inconsistency identification in network meta-analysis via stochastic search variable selection

Georgios Seitidis, Stavros Nikolakopoulos, Ioannis Ntzoufras, Dimitris Mavridis

Summary: The reliability of network meta-analysis (NMA) results depends on the plausibility of the transitivity assumption, which assumes that the distribution of effect modifiers is similar across treatment comparisons. Different methods have been proposed to evaluate consistency, and our method, stochastic search inconsistency factor selection (SSIFS), uses variable selection techniques to determine the inclusion of inconsistency factors in the model. Our approach quantifies the posterior inclusion probability of each inconsistency factor and incorporates differences between direct and indirect evidence. We also construct an informative prior based on historical data from 201 published network meta-analyses.

STATISTICS IN MEDICINE (2023)

Article Computer Science, Artificial Intelligence

A comparative study of different variable selection methods based on numerical simulation and empirical analysis

Dake Hou, Wenli Zhou, Qiuxia Zhang, Kun Zhang, Jiaqi Fang

Summary: This study uses computer science and statistical principles to evaluate the effectiveness of the linear random effect model, employing Lasso variable selection techniques. It assesses the model's consistency, prediction accuracy, stability, and efficiency through numerical simulation and empirical research. A novel approach is used to assess variable selection consistency, employing the angle between the actual coefficient vector β and the estimated coefficient vector β̂. The study also utilizes statistical tools, such as the boxplot, to visually represent prediction accuracy and variable selection consistency. The proposed method is compared to commonly used analysis methods, demonstrating its effectiveness and convenience in analyzing model stability and efficiency.

PEERJ COMPUTER SCIENCE (2023)

Article Statistics & Probability

Variable selection with ABC Bayesian forests

Yi Liu, Veronika Rockova, Yuexi Wang

Summary: The study abandons the linear model framework and turns to tree-based methods for variable selection, proposing a Bayesian tree-based probabilistic method that shows consistency under certain conditions. Additionally, a new ABC sampling method based on data-splitting is introduced to achieve higher acceptance rates, successfully identifying variables with high marginal inclusion probabilities. This research provides a new avenue towards approximating the median probability model in non-parametric setups where the marginal likelihood is intractable.

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY (2021)

Article Computer Science, Artificial Intelligence

Trimming stability selection increases variable selection robustness

Tino Werner

Summary: Contamination can distort estimators, but robustness can address this issue. However, there is little discussion on the relationship between contamination and distorted variable selection in the literature. Many methods for sparse model selection, such as Stability Selection, have been proposed. We introduce the variable selection breakdown point to measure the number of contaminated cases or cells required to detect no relevant variables. By combining the variable selection breakdown point with resampling, we quantify the robustness of Stability Selection. Our trimmed Stability Selection method aggregates only the models with the best performance, reducing the impact of heavily contaminated resamples.

MACHINE LEARNING (2023)

Article Computer Science, Theory & Methods

A parallel algorithm for ridge-penalized estimation of the multivariate exponential family from data of mixed types

Diederik S. Laman Trip, Wessel N. van Wieringen

Summary: Efficient computation of penalized estimators for multivariate exponential family distributions, including Markov random fields with mixed variable types, is studied. The model parameter is estimated through maximizing pseudo-likelihood with a convex penalty, leading to a consistent estimator. A computationally efficient parallel Newton-Raphson algorithm is introduced for numerical evaluation of the estimator, with considerations for convergence.

STATISTICS AND COMPUTING (2021)

