Article
Public, Environmental & Occupational Health
Chinchin Wang, Tyrel Stokes, Russell J. Steele, Niels Wedderkopp, Ian Shrier
Summary: Researchers demonstrated that random hot deck imputation can achieve plausible multiple imputation in longitudinal studies, serving as an alternative method when model-based approaches are infeasible.
CLINICAL EPIDEMIOLOGY
(2022)
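As a rough illustration of the idea named in the summary above, random hot deck imputation fills each missing value with a value drawn at random from observed "donor" records, and repeating the draw yields several completed datasets for multiple imputation. The sketch below is a simplified, cross-sectional version and is not the authors' longitudinal procedure; the function name `random_hot_deck`, the toy data, and the use of `None` to mark missing values are all assumptions made for illustration.

```python
import random


def random_hot_deck(values, n_imputations=5, seed=0):
    """Return n_imputations completed copies of `values`.

    Each None is replaced by a value drawn at random (with replacement)
    from the observed entries. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    donors = [v for v in values if v is not None]
    if not donors:
        raise ValueError("no observed values available as donors")
    return [
        [v if v is not None else rng.choice(donors) for v in values]
        for _ in range(n_imputations)
    ]


# Toy data: None marks a missing measurement (assumed encoding).
observed = [4.1, None, 3.8, 5.0, None, 4.4]
completed = random_hot_deck(observed, n_imputations=3, seed=42)

# Analyse each completed dataset separately, then pool the point estimates.
means = [sum(c) / len(c) for c in completed]
pooled_mean = sum(means) / len(means)
```

Pooling here covers only the point estimate; a full multiple imputation analysis would also combine within- and between-imputation variance.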
Article
Environmental Sciences
Steven Pan, Sixia Chen
Summary: Sample estimates derived from data with missing values may be unreliable due to nonresponse bias, so imputation methods are often preferred. This study compared three popular imputation methods for handling multivariate missing data under different missing patterns. Limited simulation results demonstrated the effect of each imputation method on reducing bias and increasing efficiency for the parameter estimate of interest. Although these methods did not consistently outperform listwise deletion, they improved many descriptive and regression estimates when imputing all incomplete variables at once.
INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH
(2023)
Article
Mathematical & Computational Biology
Daniel Westreich, Jeffrey S. A. Stringer
Summary: Inverse probability weighting is a useful method for correcting for missing data. New estimators for nonmonotone missingness, including an unconstrained maximum likelihood estimator (UMLE) and a constrained Bayesian estimator (CBE), were introduced in 2018. This study compares the performance of these estimators with multiple imputation (MI) in the setting of an observational study, where inverse probability of treatment weights are used to address confounding.
STATISTICS IN MEDICINE
(2023)
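To make the basic idea of inverse probability weighting for missing data concrete, the following minimal sketch weights each complete record by the inverse of the estimated probability that a record in its stratum is observed. It is far simpler than the UMLE and CBE estimators for nonmonotone missingness discussed in the summary above and does not implement them; the function name `ipw_mean`, the stratum labels, and the toy data are assumptions for illustration.

```python
from collections import defaultdict


def ipw_mean(records):
    """Inverse-probability-weighted mean of an outcome with missing values.

    `records` is a list of (stratum, outcome) pairs, with outcome None when
    missing. Each observed outcome gets weight 1 / P(observed | stratum),
    estimated as (records in stratum) / (observed records in stratum).
    Strata with no observed outcomes contribute nothing (a limitation).
    """
    total = defaultdict(int)
    seen = defaultdict(int)
    for stratum, y in records:
        total[stratum] += 1
        if y is not None:
            seen[stratum] += 1

    numerator = 0.0
    denominator = 0.0
    for stratum, y in records:
        if y is None:
            continue
        weight = total[stratum] / seen[stratum]
        numerator += weight * y
        denominator += weight
    return numerator / denominator


# Toy data: half of stratum "A" is missing, stratum "B" is fully observed.
data = [("A", 1.0), ("A", 1.0), ("A", None), ("A", None), ("B", 3.0), ("B", 3.0)]
complete_case_mean = sum(y for _, y in data if y is not None) / 4
weighted_mean = ipw_mean(data)
```

In this toy example the complete-case mean over-represents stratum "B", while the weighted mean restores each stratum to its share of the full sample.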
Article
Health Care Sciences & Services
Lauren J. Beesley, Irina Bondarenko, Michael R. Elliot, Allison W. Kurian, Steven J. Katz, Jeremy M. G. Taylor
Summary: This paper describes how to generalize the sequential regression multiple imputation procedure to handle non-random missingness when missingness may depend on other variables. The method reduces bias in the final analysis compared to standard techniques, using approximation strategies involving inclusion of an offset in the imputation model.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2021)
Editorial Material
Public, Environmental & Occupational Health
Stephen R. Cole, Paul N. Zivich, Jessie K. Edwards, Rachael K. Ross, Bonnie E. Shook-Sa, Joan T. Price, Jeffrey S. A. Stringer
Summary: Missing data is a common and significant problem in epidemiology, leading to decreased precision and notable bias. There are currently too few simple examples illustrating the types of missing data and their impact on results, and ignoring missing data remains a standard approach in epidemiology.
AMERICAN JOURNAL OF EPIDEMIOLOGY
(2023)
Article
Social Sciences, Mathematical Methods
Roderick J. Little, James R. Carpenter, Katherine J. Lee
Summary: This article discusses three common methods for addressing missing data: complete-case analysis, weighting, and multiple imputation. It provides a non-technical discussion of the strengths and weaknesses of these approaches, and when to choose each method. The methods are illustrated using data from the Youth Cohort (Time) Series for England, Wales and Scotland, 1984-2002.
SOCIOLOGICAL METHODS & RESEARCH
(2022)
Article
Social Sciences, Mathematical Methods
Rebecca Andridge, Laura Bechtel, Katherine Jenny Thompson
Summary: Detailed breakdowns of totals are often sparse and vary widely in proportions across units. Hot-deck imputation is used to fill in missing data and preserve multinomial distributions, but the best variant of the hot-deck method is unclear.
JOURNAL OF SURVEY STATISTICS AND METHODOLOGY
(2021)
Article
Urology & Nephrology
Katrina Blazek, Anita van Zwieten, Valeria Saglimbene, Armando Teixeira-Pinto
Summary: Health data often have missing values, and multiple imputation techniques can help reduce bias and maintain sample size. Correct specification of the imputation model is crucial for the validity of analyses. Considerations such as the missingness mechanism, the imputation method, and the reporting of results are important when conducting research with multiply imputed data.
KIDNEY INTERNATIONAL
(2021)
Article
Anthropology
Jeffrey A. Smith, Jonathan H. Morgan, James Moody
Summary: Missing data is a common and challenging issue in network studies, and choosing the best imputation strategy depends on the type of missing data, the type of network, and the measure of interest.
Article
Computer Science, Artificial Intelligence
Martinez-Plumed Fernando, Ferri Cesar, Nieves David, Hernandez-Orallo Jose
Summary: This paper analyzes the relationship between missing values and algorithmic fairness in machine learning, indicating that rows containing missing values are usually fairer than the rest. The handling of missing values affects the trade-off between algorithm fairness and performance.
INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
(2021)
Article
Computer Science, Artificial Intelligence
Feng Zhao, Yan Lu, Xinning Li, Lina Wang, Yingjie Song, Deming Fan, Caiming Zhang, Xiaobo Chen
Summary: Credit risk assessment is crucial for banks in loan approval and risk management. However, missing credit risk data can significantly reduce the effectiveness of the assessment model. In this paper, a novel method named MGAIN is proposed to accurately predict missing data through subset selection and a multiple imputation strategy, improving the accuracy of the imputation model.
APPLIED SOFT COMPUTING
(2022)
Article
Mathematics
Fangfang Li, Hui Sun, Yu Gu, Ge Yu
Summary: This paper proposes NPMI, a noise-aware multiple imputation algorithm for missing values in static data. Different multiple imputation models are proposed according to the missingness mechanism of the data. A method for determining the imputation order when multiple variables have missing values is given. Experiments on real and synthetic datasets verify the accuracy and efficiency of the proposed algorithm.
Article
Business
Lean Yu, Rongtian Zhou, Rongda Chen, Kin Keung Lai
Summary: Missing data has become a serious problem in credit risk classification and this study proposes a one-hot encoding-based data preprocessing method to solve this problem. The experimental results suggest that the proposed method performs the best with high missing rates and the random sample imputation method performs better with low missing rates.
EMERGING MARKETS FINANCE AND TRADE
(2022)
Article
Engineering, Multidisciplinary
Han Honggui, Sun Meiting, Wu Xiaolong, Li Fangyu
Summary: This article proposes a double-cycle weighted imputation (DCWI) method to deal with multiple missing patterns in the wastewater treatment process. The method maximizes the utilization of available information to improve imputation accuracy and experimental results show its superiority over comparison methods.
SCIENCE CHINA-TECHNOLOGICAL SCIENCES
(2022)
Article
Multidisciplinary Sciences
Hannah Voss, Simon Schlumbohm, Philip Barwikowski, Marcus Wurlitzer, Matthias Dottermusch, Philipp Neumann, Hartmut Schlueter, Julia E. Neumann, Christoph Krisp
Summary: HarmonizR is an efficient tool for missing data tolerant experimental variance reduction, which does not require data imputation and can be easily adjusted for individual dataset properties and user preferences. It demonstrated successful data harmonization for different tissue preservation techniques, LC-MS/MS instrumentation setups, and quantification approaches, and outperformed data imputation methods in detecting significant proteins.
NATURE COMMUNICATIONS
(2022)