Article
Statistics & Probability
Emmanuel O. Ogundimu
Summary: Sample selection issues arise when the outcome of interest is only partially observed. Handling them typically requires exclusion restrictions, and irrelevant variables may end up in the model. When expert knowledge about which variables to include is lacking, the adaptive Lasso is proposed to perform variable selection and parameter estimation simultaneously in the selection and outcome submodels.
STATISTICAL PAPERS
(2022)
Article
Computer Science, Information Systems
Jakob A. Dambon, Fabio Sigrist, Reinhard Furrer
Summary: Spatially varying coefficient models are used to analyze spatial data with varying covariate effects. This paper proposes a new variable selection approach for Gaussian process-based SVC models, which is validated through simulation and real-world data analysis. The proposed approach yields sparser models and achieves better performance compared to classical maximum likelihood estimation.
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE
(2022)
Article
Engineering, Industrial
Cheoljoon Jeong, Xiaolei Fang
Summary: This paper introduces a penalized matrix regression method for diagnosing the root cause of product quality defects in multistage manufacturing processes. The unknown regression coefficient matrix is decomposed into two factor matrices whose rows and columns are penalized simultaneously, which induces sparsity. A Block Coordinate Proximal Descent (BCPD) optimization algorithm is developed for parameter estimation; it cyclically solves convex subproblems.
Article
Mathematical & Computational Biology
Trias W. Rakhmawati, Il Do Ha, Hangbin Lee, Youngjo Lee
Summary: Competing risks data arise when one event precludes the observation of other events, and are often found in clustered clinical studies such as multi-center clinical trials. A variable selection procedure for fixed effects in cause-specific competing risks frailty models using penalized h-likelihood is proposed, showing promising results in simulation studies. The method is illustrated using clustered competing-risks cancer data sets.
STATISTICS IN MEDICINE
(2021)
Article
Computer Science, Interdisciplinary Applications
Sahir R. Bhatnagar, Tianyuan Lu, Amanda Lovato, David L. Olds, Michael S. Kobor, Michael J. Meaney, Kieran O'Donnell, Archer Y. Yang, Celia M. T. Greenwood
Summary: This article presents a method called sail for detecting non-linear interactions in high-dimensional settings. The method is proven to have the oracle property, performing as well as if the true model were known in advance. Simulation results show that sail outperforms existing penalized regression methods in terms of prediction accuracy and support recovery.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2023)
Article
Mathematics
Fan Feng, Guanghui Cheng, Jianguo Sun
Summary: In this paper, a penalized variable selection technique based on Cox's proportional hazards model is developed for length-biased data with interval censoring. The proposed method is applied to real data and outperforms traditional variable selection methods based on conditional likelihood.
Review
Health Care Sciences & Services
Fatima-Zahra Jaouimaa, Il Do Ha, Kevin Burke
Summary: Standard survival models have limited flexibility as they only contain a single regression component. In contrast, the multi-parameter regression approach allows for simultaneous consideration of multiple distributional parameters with relatively low model complexity. However, variable selection methods in this approach are underdeveloped. Therefore, we propose penalized multi-parameter regression estimation procedures and compare them using simulation studies and an application to real data.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2023)
Article
Computer Science, Theory & Methods
Meadhbh O'Neill, Kevin Burke
Summary: Modern variable selection procedures utilize penalization methods to simultaneously perform model selection and estimation. The least absolute shrinkage and selection operator is a popular method that requires selecting a tuning parameter value. In contrast, our approach based on the smooth IC automatically selects the tuning parameter in one step. We also extend this method to the distributional regression framework, which offers more flexibility than classical regression modeling.
STATISTICS AND COMPUTING
(2023)
Article
Mathematics
Xuan Liu, Jianbao Chen
Summary: This paper discusses variable selection in the spatial autoregressive model using a penalized quasi-likelihood approach. The method is shown to be effective at identifying spatial effects, reducing dimensionality, and estimating parameters, and the theoretical results support its ability to recognize spatial effects.
Article
Mathematics
Haofeng Wang, Xuejun Jiang, Min Zhou, Jiancheng Jiang
Summary: This paper studies variable selection in distributed sparse regression when the sample size is large and memory is limited. By improving on the traditional divide-and-conquer method, the proposed method better controls the false discovery rate and reduces the computational burden. Theoretical properties and computational algorithms are established, and the method is evaluated through simulations and a real example.
COMMUNICATIONS IN MATHEMATICS AND STATISTICS
(2023)
Article
Multidisciplinary Sciences
Eliana Lima, Robert Hyde, Martin Green
Summary: This study developed an adaptable method for identifying causal factors from high-dimensional data by evaluating six variable selection methods, comparing results using graphical methods, and conducting formal triangulation to combine results from different methods.
SCIENTIFIC REPORTS
(2021)
Article
Mathematics
Miaojie Xia, Yuqi Zhang, Ruiqin Tian
Summary: This paper studies the variable selection problem of high-dimensional spatial autoregressive panel models with fixed effects. It proposes a matrix transformation method to eliminate the fixed effects and develops a penalized quasi-maximum likelihood approach for variable selection and parameter estimation in the transformed panel model. The consistency and oracle properties of the proposed estimator are established under certain regularity conditions. Monte Carlo experiments and a real data analysis demonstrate the satisfactory performance of the proposed variable selection method.
JOURNAL OF MATHEMATICS
(2023)
Article
Computer Science, Artificial Intelligence
Chunming Zhang, Lixing Zhu, Yanbo Shen
Summary: In this paper, a new family of robust Bregman divergences called robust-BD is proposed, which is less sensitive to data outliers. The performance of the proposed penalized robust-BD estimate is evaluated through extensive numerical experiments and compared with classical approaches, showing improvements over existing methods.
Article
Computer Science, Software Engineering
Yanxi Xie, Yuewen Li, Victor Shi, Quan Lu
Summary: Variable selection is crucial in data mining, especially in high-dimensional settings. This article proposes an orthogonal matching pursuit algorithm for variable screening, which performs well in filtering relevant variables and reducing computational cost.
SCIENTIFIC PROGRAMMING
(2022)
Article
Biotechnology & Applied Microbiology
Heather J. Zhou, Lei Li, Yumei Li, Wei Li, Jingyi Jessica Li
Summary: This study benchmarks popular hidden variable inference methods and finds that principal component analysis (PCA) not only underlies the statistical methodology behind these methods, but is also faster, better-performing, and easier to interpret and use. To help researchers use PCA in their QTL analysis, the authors provide an R package and a detailed guide, believing that using PCA will substantially improve and simplify hidden variable inference in QTL mapping, as well as increase the transparency and reproducibility of QTL research.
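As a rough illustration of the PCA-based hidden variable inference mentioned in the last summary, the sketch below computes the top principal components of a simulated data matrix so they could be used as inferred hidden covariates. It is not the authors' R package: the function name `top_pcs`, the matrix `expr` (samples by features), and the choice of `k` are all hypothetical, and the data are simulated.

```python
import numpy as np

def top_pcs(expr, k):
    """Return the first k principal component scores of a samples x features matrix."""
    # Center and scale each feature (column); assumes no constant columns.
    z = (expr - expr.mean(axis=0)) / expr.std(axis=0, ddof=1)
    # Thin SVD of the standardized matrix: z = U @ diag(s) @ Vt.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    # PC scores are the columns of U scaled by the singular values.
    return u[:, :k] * s[:k]

# Simulated example (hypothetical data): k hidden factors drive all features.
rng = np.random.default_rng(0)
n_samples, n_features, k = 50, 200, 3
hidden = rng.normal(size=(n_samples, k))
loadings = rng.normal(size=(k, n_features))
expr = hidden @ loadings + 0.1 * rng.normal(size=(n_samples, n_features))

pcs = top_pcs(expr, k)  # one row per sample, one column per inferred covariate
```

In a downstream analysis, the columns of `pcs` would be added as covariates alongside known ones; how many components to keep is a separate choice that this sketch does not address.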