Article

Model validation and influence diagnostics for regression models with missing covariates

STATISTICS IN MEDICINE (2018)

Journal

STATISTICS IN MEDICINE

Volume 37, Issue 8, Pages 1325-1342

Publisher

WILEY

DOI: 10.1002/sim.7584

Keywords

goodness-of-fit test; influence diagnostics; missing covariates; model validation; multiple imputation; residual analysis

Categories

Mathematical & Computational Biology; Public, Environmental & Occupational Health; Medical Informatics; Medicine, Research & Experimental; Statistics & Probability

Abstract

Missing covariate values are prevalent in regression applications. While an array of methods have been developed for estimating parameters in regression models with missing covariate data for a variety of response types, minimal focus has been given to validation of the response model and influence diagnostics. Previous research has mainly focused on estimating residuals for observations with missing covariates using expected values, after which specialized techniques are needed to conduct proper inference. We suggest a multiple imputation strategy that allows for the use of standard methods for residual analyses on the imputed data sets or a stacked data set. We demonstrate the suggested multiple imputation method by analyzing the Sleep in Mammals data in the context of a linear regression model and the New York Social Indicators Status data with a logistic regression model.
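The general idea of running standard residual analyses on imputed or stacked data can be illustrated with a small sketch. This is not the authors' procedure: the toy data, the single-covariate linear model, the normal imputation model for `x` given `y`, and the helper names `ols`, `impute_once`, and `stacked_residuals` are all assumptions made for illustration. The imputation step also holds the imputation-model parameters fixed at their estimates, which a proper multiple imputation would not do.

```python
import random


def ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, residual SD)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, (rss / (n - 2)) ** 0.5


def impute_once(x, y, rng):
    """Fill missing x (None) with a random draw from a normal model for
    x given y, fitted on the complete cases (illustrative assumption)."""
    obs = [(xi, yi) for xi, yi in zip(x, y) if xi is not None]
    a, b, sd = ols([yi for _, yi in obs], [xi for xi, _ in obs])
    return [xi if xi is not None else rng.gauss(a + b * yi, sd)
            for xi, yi in zip(x, y)]


def stacked_residuals(x, y, m=5, seed=0):
    """Impute m times, refit the response model on each completed data
    set, and stack rows of (imputation index, case index, scaled residual)."""
    rng = random.Random(seed)
    rows = []
    for k in range(m):
        xk = impute_once(x, y, rng)
        a, b, sd = ols(xk, y)
        for i, (xi, yi) in enumerate(zip(xk, y)):
            rows.append((k, i, (yi - a - b * xi) / sd))
    return rows


# Toy data: y depends linearly on x; every fourth x value is set to missing.
data_rng = random.Random(1)
x_full = [data_rng.gauss(0, 1) for _ in range(60)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0, 0.5) for xi in x_full]
x_miss = [None if i % 4 == 0 else xi for i, xi in enumerate(x_full)]

rows = stacked_residuals(x_miss, y, m=5)
```

Each row of `rows` carries an imputation index, so the usual diagnostics (for example, residuals plotted against fitted values) could be drawn per imputed data set or on the stacked set as a whole.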

Recommended

Article Health Care Sciences & Services

Goodness-of-fit tests for a logistic regression model with missing covariates

Shen-Ming Lee, Phuoc-Loc Tran, Chin-Shang Li

Summary: This paper addresses the issue of model checking for logistic regression with covariates missing at random. Two goodness-of-fit tests, Pearson chi-squared and unweighted residual sum-of-squares tests, are proposed and their test statistics are centered using inverse probability weighting (IPW) and nonparametric multiple imputation (MI) methods to solve the missing value problem. The paper establishes the asymptotic properties of these test statistics and introduces the IPW method and bootstrap re-sampling approaches to estimate the variances of the proposed test statistics. Simulation studies and real data examples are conducted to evaluate the performance of the proposed tests.

STATISTICAL METHODS IN MEDICAL RESEARCH (2022)

Article Computer Science, Artificial Intelligence

Learning conditional variational autoencoders with missing covariates

Siddharth Ramchandran, Gleb Tikhonov, Otto Lonnroth, Pekka Tiikkainen, Harri Lahdesmaki

Summary: Conditional variational autoencoders (CVAEs) are versatile deep latent variable models that extend the standard VAE framework by conditioning the generative model with auxiliary covariates. This paper proposes a method to learn conditional VAEs from datasets with missing values in auxiliary covariates, and demonstrates superior performance compared to previous methods in various experimental settings.

PATTERN RECOGNITION (2024)

Article Economics

Imputations for High Missing Rate Data in Covariates Via Semi-supervised Learning Approach

Wei Lan, Xuerong Chen, Tao Zou, Chih-Ling Tsai

Summary: This study introduces a method, inspired by semi-supervised learning, for imputing covariates with high missing rates. It imposes no model assumptions, provides closed-form imputations for continuous covariates, and also applies to discrete covariates.

JOURNAL OF BUSINESS & ECONOMIC STATISTICS (2022)

Article Statistics & Probability

Goodness-of-fit tests for quantile regression with missing responses

Ana Perez-Gonzalez, Tomas R. Cotos-Yanez, Wenceslao Gonzalez-Manteiga, Rosa M. Crujeiras-Casais

Summary: This paper introduces and analyzes goodness-of-fit tests for quantile regression models in the presence of missing observations in the response variable, based on construction of empirical processes and considering three different approaches. The performance of different test statistics is extensively studied through simulation, with an application to real data included.

STATISTICAL PAPERS (2021)

Article Physics, Multidisciplinary

On the Relation between Prediction and Imputation Accuracy under Missing Covariates

Burim Ramosaj, Justus Tulowietzki, Markus Pauly

Summary: This study analyzes the interaction between imputation accuracy and prediction accuracy in regression learning problems with missing covariates when using machine learning methods. The results show that even a slight decrease in imputation accuracy can seriously affect prediction accuracy.

ENTROPY (2022)

Article Computer Science, Artificial Intelligence

Missing value estimation using clustering and deep learning within multiple imputation framework

Manar D. Samad, Sakib Abrar, Norou Diawara

Summary: This paper proposes methods to improve the imputation accuracy of the MICE algorithm by using ensemble learning and deep neural networks. The results of extensive analyses on multiple datasets show that the proposed methods outperform other state-of-the-art imputation algorithms, leading to better imputation accuracy and classification accuracy.

KNOWLEDGE-BASED SYSTEMS (2022)

Article Health Care Sciences & Services

Multiple imputation with missing data indicators

Lauren J. Beesley, Irina Bondarenko, Michael R. Elliott, Allison W. Kurian, Steven J. Katz, Jeremy M. G. Taylor

Summary: This paper describes how to generalize the sequential regression multiple imputation procedure to handle non-random missingness when missingness may depend on other variables. The method reduces bias in the final analysis compared to standard techniques, using approximation strategies involving inclusion of an offset in the imputation model.

STATISTICAL METHODS IN MEDICAL RESEARCH (2021)

Article Statistics & Probability

Kernel machines with missing covariates

Tiantian Liu, Yair Goldberg

Summary: We propose a family of doubly robust kernel machines for classification in the presence of missing covariates. The model features double robustness against misspecification and computation feasibility through the use of a novel convex augmented loss function and various techniques. The theoretical results and empirical analysis demonstrate the performance of the proposed kernel machine.

ELECTRONIC JOURNAL OF STATISTICS (2023)

Article Mathematics, Interdisciplinary Applications

An Efficient Multiple Imputation Approach for Estimating Equations with Response Missing at Random and High-Dimensional Covariates

Lei Wang, Siying Sun, Zheng Xia

Summary: This paper introduces an empirical likelihood-based inference for parameters defined by the general estimating equations, showing consistency and asymptotic normality of the resulting estimator. The authors propose a two-stage estimation procedure using dimension-reduced kernel estimators and AIPW-MI methods, demonstrating the finite-sample performance through simulation and application to HIV-CD4 data.

JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY (2021)

Article Urology & Nephrology

A practical guide to multiple imputation of missing data in nephrology

Katrina Blazek, Anita van Zwieten, Valeria Saglimbene, Armando Teixeira-Pinto

Summary: Health data often have missing values, and utilizing multiple imputation techniques can help reduce bias and maintain sample size. Correct specification of the imputation model is crucial for the validity of analyses. Considerations such as missing mechanism, imputation method, and result reporting are important when conducting research with multiply imputed data.

KIDNEY INTERNATIONAL (2021)

Article Health Care Sciences & Services

Variable selection with missing data in both covariates and outcomes: Imputation and machine learning

Liangyuan Hu, Jung-Yi Joyce Lin, Jiayi Ji

Summary: This research investigates a general variable selection approach that can handle missing covariates and outcomes, utilizing machine learning models and bootstrap imputation. Results suggest that extreme gradient boosting and Bayesian additive regression trees have the best overall variable selection performance.

STATISTICAL METHODS IN MEDICAL RESEARCH (2021)

Article Computer Science, Information Systems

Imputation of Missing Clinical Covariates for Downstream Classification Problems

Benjamin Agbo, Hussain Al-Aqrabi, Tariq Alsboui, Muhammad Hussain, Richard Hill

Summary: The growth of intelligent devices has generated vast amounts of data, but there are often missing values in these datasets. Traditional approaches for handling missing values are not practical, so there is a need to develop strategies for imputation. This study proposes an imputation strategy called $med.BFMVI$ which shows the best performance in replacing missing values and improving downstream learning.

IEEE ACCESS (2023)

Article Computer Science, Artificial Intelligence

Multiple imputation method of missing credit risk assessment data based on generative adversarial networks

Feng Zhao, Yan Lu, Xinning Li, Lina Wang, Yingjie Song, Deming Fan, Caiming Zhang, Xiaobo Chen

Summary: Credit risk assessment is crucial for banks in loan approval and risk management. However, missing credit risk data can significantly reduce the effectiveness of the assessment model. In this paper, a novel method named MGAIN is proposed to accurately predict missing data through subset selection and multiple imputation strategy, improving the accuracy of the imputation model.

APPLIED SOFT COMPUTING (2022)

Article Computer Science, Information Systems

Goodness-of-Fit Tests on Manifolds

Alexander Shapiro, Yao Xie, Rui Zhang

Summary: The research develops a general theory for goodness-of-fit testing of non-linear models, in which the residual of the model fit follows a chi-squared distribution related to the model order and problem dimension. A sequential method for selecting model orders is presented, demonstrating broad applications in machine learning and signal processing.

IEEE TRANSACTIONS ON INFORMATION THEORY (2021)

Article Mathematics

A Novel Goodness-of-Fit Test for Cauchy Distribution

A. Pekgor

Summary: Recently, new goodness-of-fit tests based on Kullback-Leibler divergence and likelihood ratio have been introduced for the Cauchy distribution, claiming to be more powerful than traditional tests. This study proposes a novel test for the Cauchy distribution and derives its asymptotic null distribution. Critical values are determined through Monte Carlo simulation for various sample sizes, and power analysis reveals the superiority of the proposed test under certain conditions.

JOURNAL OF MATHEMATICS (2023)
