Article
Computer Science, Artificial Intelligence
Siddharth Ramchandran, Gleb Tikhonov, Otto Lonnroth, Pekka Tiikkainen, Harri Lahdesmaki
Summary: Conditional variational autoencoders (CVAEs) are versatile deep latent variable models that extend the standard VAE framework by conditioning the generative model with auxiliary covariates. This paper proposes a method to learn conditional VAEs from datasets with missing values in auxiliary covariates, and demonstrates superior performance compared to previous methods in various experimental settings.
PATTERN RECOGNITION
(2024)
Article
Mathematics, Applied
Jin Jin, Peng Ye, Liuquan Sun
Summary: In this article, a class of weighted estimating equations is proposed for handling missing covariate data in biomedical studies. The approach effectively addresses the estimation of selection probabilities in both parametric and non-parametric modeling schemes.
SCIENCE CHINA-MATHEMATICS
(2022)
Review
Information Science & Library Science
Jiaxu Peng, Jungpil Hahn, Ke-Wei Huang
Summary: Missing values are an important and pervasive problem in the field of information systems, and a review of missing value theory is necessary to promote rigorous research practice.
INFORMATION SYSTEMS RESEARCH
(2023)
Article
Statistics & Probability
Weili Cheng, Xiaorui Li, Xiaoxia Li, Xiaodong Yan
Summary: We propose a frequentist model averaging approach for prediction in generalized linear models with missing covariates. Instead of imputing missing covariate data, we adjust the model averaging criterion by selecting a batch of candidate models with available covariate patterns. The weights for the candidate models are estimated using leave-one-out cross-validation on fully observed data. The proposed method is shown to be asymptotically optimal under regular conditions. Simulation studies and application to a real data set demonstrate the finite sample performance and effectiveness of our approach.
Article
Statistics & Probability
Hejian Sang, Jae Kwang Kim, Danhyang Lee
Summary: This article proposes a novel method of semiparametric fractional imputation (SFI) using Gaussian mixture models to handle missing data. The proposed method is computationally efficient and leads to robust estimation. Simulation studies are conducted to validate the performance of the proposed method.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
(2022)
Article
Biology
Hairu Wang, Zhiping Lu, Yukun Liu
Summary: Missing data can be divided into three categories: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). Valid statistical approaches depend on correctly identifying the underlying missingness mechanism. This paper proposes two score tests, based on a logistic model and a semiparametric location model, to distinguish between the MAR and MNAR mechanisms. Simulations and an analysis of HIV data demonstrate the effectiveness of the score tests.
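The three mechanisms named in this summary can be illustrated with a small simulation. The sketch below is not the paper's score tests. It only generates data under hypothetical MCAR, MAR, and MNAR missingness models and compares the complete-case mean of the outcome with the full-data mean. The data-generating model, the logistic missingness probabilities, and the function name `simulate_missingness` are all assumptions made for illustration.

```python
import math
import random


def simulate_missingness(n=20000, seed=0):
    """Compare complete-case means of y under three assumed mechanisms."""
    rng = random.Random(seed)

    def sigmoid(t):
        return 1.0 / (1.0 + math.exp(-t))

    def mean(values):
        return sum(values) / len(values)

    all_y = []
    observed = {"MCAR": [], "MAR": [], "MNAR": []}
    for _ in range(n):
        x = rng.gauss(0, 1)        # covariate, always observed
        y = x + rng.gauss(0, 1)    # outcome, possibly missing
        all_y.append(y)
        # Assumed probability that y is missing under each mechanism:
        p_missing = {
            "MCAR": 0.3,               # depends on nothing
            "MAR": sigmoid(x - 1),     # depends only on the observed x
            "MNAR": sigmoid(y - 1),    # depends on the unobserved y itself
        }
        for mechanism, p in p_missing.items():
            if rng.random() >= p:      # y is observed
                observed[mechanism].append(y)

    return mean(all_y), {m: mean(v) for m, v in observed.items()}


full_mean, complete_case = simulate_missingness()
print("full-data mean:", round(full_mean, 3))
for mechanism, value in complete_case.items():
    print(mechanism, "complete-case mean:", round(value, 3))
```

In this setup the complete-case mean is close to the full-data mean only under MCAR. Under MAR the bias could in principle be corrected using the observed covariate x, whereas under MNAR it cannot be corrected from the observed data alone without further assumptions, which is why telling the two apart matters.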
Article
Education & Educational Research
Daniel Y. Y. Lee, Jeffrey R. Harring
Summary: A Monte Carlo simulation was conducted to compare methods for handling missing data in growth mixture models. The five methods considered were a fully Bayesian approach using a Gibbs sampler, full information maximum likelihood with the EM algorithm, multiple imputation, two-stage multiple imputation, and listwise deletion. The Bayesian approach and two-stage multiple imputation generally produced less biased parameter estimates than maximum likelihood or single imputation methods, although there were also some key differences between them. The study highlights similarities and disparities among the methods and provides general recommendations.
JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS
(2023)
Article
Mathematical & Computational Biology
Jihyun Lee, S. Natasha Beretvas
Summary: The study compares four methods for handling missing covariates in meta-regression, suggesting the use of multiple imputation and full information maximum likelihood in practice, while also noting the challenges and potential advantages of using multiple imputation in the meta-analysis context.
RESEARCH SYNTHESIS METHODS
(2023)
Article
Mathematics, Interdisciplinary Applications
Roderick J. Little
Summary: This paper reviews assumptions about missing data mechanisms and discusses statistical analysis methods related to missing data, including Rubin's MAR definition and its limitations, as well as some sufficient conditions. It also explores other definitions and methods related to missing data, and presents an argument for weakening the conditions for frequentist maximum likelihood inference.
ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, VOL 8
(2021)
Article
Physics, Multidisciplinary
Zhenhuan Wu, Xingde Duan, Wenzhuan Zhang
Summary: Under the Bayesian framework, this study proposes a Tweedie compound Poisson partial linear mixed model for longitudinal semicontinuous data with nonignorable missing covariates and responses. The missing response and covariate mechanisms are specified using a logistic regression model. A hybrid algorithm combining the Gibbs sampler and the Metropolis-Hastings algorithm is employed to produce the joint Bayesian estimates. Several simulation studies and a real example with osteoarthritis data are provided to illustrate the proposed methodologies.
Article
Biology
Yang Liu, Yukun Liu, Pengfei Li, Lin Zhu
Summary: The paper introduces a maximum empirical likelihood estimation method for estimating abundance in the presence of missing covariates, showing it has smaller mean square error in simulations and more accurate coverage probabilities for confidence intervals than existing methods.
Article
Computer Science, Interdisciplinary Applications
Xianwen Ding, Jinhan Xie, Xiaodong Yan
Summary: This paper presents a model averaging estimation procedure for multiple quantile regression with missing covariate data, aiming to improve prediction accuracy. It constructs a set of candidate models based on the missing-data patterns, with weights determined by leave-one-out cross-validation. Simulation studies demonstrate the advantages of this approach over existing methods.
JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
(2021)
Article
Physics, Multidisciplinary
Burim Ramosaj, Justus Tulowietzki, Markus Pauly
Summary: This study analyzes the interaction between imputation accuracy and prediction accuracy in regression learning problems with missing covariates when using machine learning methods. The results show that even a slight decrease in imputation accuracy can seriously affect prediction accuracy.
Article
Health Care Sciences & Services
Boyu Ren, Stuart R. Lipsitz, Roger D. Weiss, Garrett M. Fitzmaurice
Summary: Although there are well-developed methods for handling missing data in longitudinal studies with monotone missingness patterns, there are fewer methods available for non-monotone missingness. This study proposes a multiple imputation approach under a no self-censoring mechanism to handle non-monotone missing not at random data. Simulation and asymptotic studies are conducted to investigate the performance of the proposed imputation approach, and extensions to non-binary data settings are discussed.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2023)
Article
Social Sciences, Mathematical Methods
Shu-Hui Hsieh, Shen-Ming Lee, Chin-Shang Li
Summary: Income surveys are complicated by the sensitive nature of the topic. The authors propose a two-stage multilevel randomized response technique to investigate true income levels while protecting privacy. They use models and methods to handle missing data and evaluate their effectiveness through simulation studies.
SOCIOLOGICAL METHODS & RESEARCH
(2022)