Article
Statistics & Probability
Xiaoyu Ma, Lu Lin
Summary: The paper introduces a measurement error selection likelihood that simultaneously selects important variables and estimates additive components in a high-dimensional additive model. Although the proposed estimator involves multiple variables, it takes a univariate nonparametric form. The study establishes selection consistency and examines finite-sample performance through Monte Carlo experiments.
Article
Statistics & Probability
Jiangzhou Wang, Jingfei Zhang, Binghui Liu, Ji Zhu, Jianhua Guo
Summary: The stochastic block model is a widely studied network model for community detection, but fitting its likelihood function on large-scale networks is challenging. In this article, the authors propose a novel likelihood-based approach that decouples row and column labels, enabling fast alternating maximization. The method is computationally efficient and has a provable convergence guarantee, and it is shown to provide consistent estimates of communities in a stochastic block model.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Article
Mathematics
Jiaxuan Liang, Yi Cheng, Yuqi Su, Shuyue Xiao, Yunquan Song
Summary: When the spatial response variables are discrete, the spatial logistic autoregressive model improves classification accuracy by adding a network structure to the ordinary logistic regression model. Sparse spatial logistic regression models have attracted significant attention due to the emergence of high-dimensional data in various fields. In this paper, a variable selection method is proposed for the high-dimensional spatial logistic autoregressive model. The penalized likelihood function is solved efficiently by an algorithm that identifies important variables and makes predictions. Simulations and a real example demonstrate the good performance of the proposed method with limited sample sizes.
Article
Statistics & Probability
Satyajit Ghosh, Kshitij Khare, George Michailidis
Summary: This study introduces a pseudo-likelihood based Bayesian approach for consistent variable selection in high-dimensional VAR models, and demonstrates strong selection consistency of the proposed method. The research has significant implications for understanding the behavior of VAR models in high-dimensional settings.
ANNALS OF STATISTICS
(2021)
Article
Statistics & Probability
Chixiang Chen, Ming Wang, Rongling Wu, Runze Li
Summary: Conventional likelihood-based criteria for model selection are limited when it comes to specifying a distribution for complex data. A data-driven model-selection criterion based on the empirical likelihood function is proposed, with robust plug-in estimators allowing for versatile applications and outperforming traditional criteria. The consistent model-selection property under a general context is established, with extensive simulation studies confirming its superiority.
Article
Statistics & Probability
Francis K. C. Hui, Samuel Muller, A. H. Welsh
Summary: In this article, the connections between the parameters from conditional and marginal models for multivariate binary responses are studied. Variable selection assisted by generalized estimating equations (GEE) is proposed for fast variable selection in generalized linear latent variable models (GLLVMs), and simulation studies demonstrate its strong finite-sample performance and computational efficiency.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Article
Physics, Multidisciplinary
Yunquan Song, Yuqi Su, Zhijian Wang
Summary: The paper proposes a variable selection method with linear constraints for high-dimensional spatial logistic autoregressive models in order to integrate prior information. Monte Carlo experiments show that the method performs well under finite samples.
Article
Physics, Fluids & Plasmas
Takashi Arai
Summary: The proposed probability distribution for multivariate binary random variables is represented by principal minors of the parameter matrix, similar to the inverse covariance matrix in the multivariate Gaussian distribution. The model allows for analytical expressions of the partition function, central moments, and marginal and conditional distributions. Additionally, the proposed distribution can be obtained using Grassmann numbers, with the inverse matrix representing partial correlation.
Article
Statistics & Probability
Jia Wang, Xizhen Cai, Xiaoyue Niu, Runze Li
Summary: This article introduces a class of network models where the likelihood of connection is influenced by high-dimensional nodal covariates and node-specific popularity. A Bayesian method is proposed for feature selection, with implementation via Gibbs sampling. To address computational challenges in large sparse networks, a working model is developed for parameter updates based on dense sub-graphs. Model selection consistency is proven for both models, even when the dimension grows exponentially. Monte Carlo studies and real-world examples illustrate the performance of the proposed models and estimation procedures. Supplementary materials are available online.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2023)
Article
Mathematics
Fan Feng, Guanghui Cheng, Jianguo Sun
Summary: In this paper, a penalized variable selection technique based on Cox's proportional hazards model is developed for length-biased data with interval censoring. The proposed method is applied to real data and outperforms traditional variable selection methods based on conditional likelihood.
Article
Mathematical & Computational Biology
Georgios Seitidis, Stavros Nikolakopoulos, Ioannis Ntzoufras, Dimitris Mavridis
Summary: The reliability of network meta-analysis (NMA) results depends on the plausibility of the transitivity assumption, which assumes that the distribution of effect modifiers is similar across treatment comparisons. Different methods have been proposed to evaluate consistency, and our method, stochastic search inconsistency factor selection (SSIFS), uses variable selection techniques to determine the inclusion of inconsistency factors in the model. Our approach quantifies the posterior inclusion probability of each inconsistency factor and incorporates differences between direct and indirect evidence. We also construct an informative prior based on historical data from 201 published network meta-analyses.
STATISTICS IN MEDICINE
(2023)
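The posterior inclusion probability mentioned in the summary above is commonly estimated as the share of posterior draws in which a factor's inclusion indicator equals 1. The sketch below illustrates only that averaging step. It is not the SSIFS implementation; the function name and the input layout (one list of 0/1 indicators per draw) are assumptions made for illustration.

```python
def posterior_inclusion_probabilities(indicator_draws):
    """Estimate inclusion probabilities from sampled 0/1 indicator vectors.

    indicator_draws is a hypothetical input: one list per posterior draw,
    with entry j equal to 1 if factor j was included in that draw.
    """
    if not indicator_draws:
        raise ValueError("at least one posterior draw is required")
    n_factors = len(indicator_draws[0])
    if any(len(draw) != n_factors for draw in indicator_draws):
        raise ValueError("all draws must have the same length")
    n_draws = len(indicator_draws)
    # The estimate for factor j is the fraction of draws that include it.
    return [
        sum(draw[j] for draw in indicator_draws) / n_draws
        for j in range(n_factors)
    ]
```

A factor whose estimate is close to 1 is included in almost every sampled model, while an estimate close to 0 means the draws rarely include it.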
Article
Computer Science, Artificial Intelligence
Dake Hou, Wenli Zhou, Qiuxia Zhang, Kun Zhang, Jiaqi Fang
Summary: This study uses computer science and statistical principles to evaluate the effectiveness of the linear random effect model, employing Lasso variable selection techniques. It assesses the model's consistency, prediction accuracy, stability, and efficiency through numerical simulation and empirical research. Variable selection consistency is assessed in a novel way, using the angle between the actual coefficient vector β and the estimated coefficient vector β̂. The study also uses statistical tools, such as the boxplot, to visually represent prediction accuracy and variable selection consistency. The proposed method is compared to commonly used analysis methods, demonstrating its effectiveness and convenience in analyzing model stability and efficiency.
PEERJ COMPUTER SCIENCE
(2023)
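The angle criterion described in the summary above can be illustrated with the standard angle between two vectors, computed from the normalized inner product. This is a minimal sketch of that generic formula, not the authors' code; the function name is hypothetical, and the summary does not say whether the paper reports radians or degrees.

```python
import math


def coefficient_angle(beta_true, beta_hat):
    """Return the angle in radians between two coefficient vectors."""
    if len(beta_true) != len(beta_hat):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(beta_true, beta_hat))
    norm_true = math.sqrt(sum(a * a for a in beta_true))
    norm_hat = math.sqrt(sum(b * b for b in beta_hat))
    if norm_true == 0 or norm_hat == 0:
        raise ValueError("the angle is undefined for a zero vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_true * norm_hat)))
    return math.acos(cosine)
```

An angle near zero means the estimate points in almost the same direction as the actual coefficient vector. Because the measure ignores scale, a uniformly shrunken estimate still yields a small angle.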
Article
Statistics & Probability
Yi Liu, Veronika Rockova, Yuexi Wang
Summary: The study abandons the linear model framework and turns to tree-based methods for variable selection, proposing a Bayesian tree-based probabilistic method that shows consistency under certain conditions. Additionally, a new ABC sampling method based on data-splitting is introduced to achieve higher acceptance rates, successfully identifying variables with high marginal inclusion probabilities. This research provides a new avenue towards approximating the median probability model in non-parametric setups where the marginal likelihood is intractable.
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY
(2021)
Article
Computer Science, Artificial Intelligence
Tino Werner
Summary: Contamination can distort estimators, but robustness can address this issue. However, there is little discussion in the literature of how contamination distorts variable selection. Many methods for sparse model selection, such as Stability Selection, have been proposed. We introduce the variable selection breakdown point, which quantifies the number of cases or cells that must be contaminated so that no relevant variable is detected. By combining the variable selection breakdown point with resampling, we quantify the robustness of Stability Selection. Our trimmed Stability Selection method aggregates only the models with the best performance, reducing the impact of heavily contaminated resamples.
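The trimming idea in the summary above (aggregate only the best-performing resampled models) can be sketched as follows. This illustrates the general idea under stated assumptions and is not the author's procedure: the input layout, the use of a loss to rank models, `keep_fraction`, and `threshold` are all hypothetical choices that the summary does not specify.

```python
import math


def trimmed_selection_frequencies(resample_results, keep_fraction=0.5):
    """Selection frequencies computed over the best-performing resamples only.

    resample_results is a hypothetical input: a list of
    (selected_variables, loss) pairs, one per resample, where a lower
    loss means a better-performing model.
    """
    if not resample_results:
        raise ValueError("at least one resample result is required")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must lie in (0, 1]")
    ranked = sorted(resample_results, key=lambda result: result[1])
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    counts = {}
    # Count how often each variable appears among the retained models.
    for selected, _loss in ranked[:n_keep]:
        for variable in set(selected):
            counts[variable] = counts.get(variable, 0) + 1
    return {variable: count / n_keep for variable, count in counts.items()}


def stable_set(frequencies, threshold=0.6):
    """Variables whose trimmed selection frequency reaches the threshold."""
    return {v for v, freq in frequencies.items() if freq >= threshold}
```

Resamples with a large loss, for example those dominated by contaminated cases, are dropped before the frequencies are computed, so they cannot vote for irrelevant variables.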
Article
Computer Science, Theory & Methods
Diederik S. Laman Trip, Wessel N. van Wieringen
Summary: Efficient computation of penalized estimators for multivariate exponential family distributions, including Markov random fields with mixed variable types, is studied. The model parameter is estimated through maximizing pseudo-likelihood with a convex penalty, leading to a consistent estimator. A computationally efficient parallel Newton-Raphson algorithm is introduced for numerical evaluation of the estimator, with considerations for convergence.
STATISTICS AND COMPUTING
(2021)