Article

Missing data: Discussion points from the PSI missing data expert group

PHARMACEUTICAL STATISTICS (2010)

Journal

PHARMACEUTICAL STATISTICS

Volume 9, Issue 4, Pages 288-297

Publisher

WILEY

DOI: 10.1002/pst.391

Keywords

missing data; LOCF; MMRM; multiple imputation

Categories

Pharmacology & Pharmacy; Statistics & Probability

Funding

ESRC [ES/G026300/1] Funding Source: UKRI
MRC [MC_U105260558] Funding Source: UKRI
Economic and Social Research Council [ES/G026300/1] Funding Source: researchfish
Medical Research Council [MC_U105260558] Funding Source: researchfish
Medical Research Council [MC_U105260558] Funding Source: Medline

Abstract

The Points to Consider Document on Missing Data was adopted by the Committee for Medicinal Products for Human Use (CHMP) in December 2001. In September 2007, the CHMP issued a recommendation to review the document, with particular emphasis on summarizing and critically appraising the pattern of drop-outs, explaining the role and limitations of the 'last observation carried forward' method and describing the CHMP's cautionary stance on the use of mixed models. In preparation for the release of the updated guidance document, statisticians in the pharmaceutical industry held a one-day expert group meeting in September 2008. Topics debated included minimizing the extent of missing data and understanding the missing data mechanism, defining the principles for handling missing data and understanding the assumptions underlying different analysis methods. A clear message from the meeting was that, at present, biostatisticians tend only to react to missing data; limited proactive planning is undertaken when designing clinical trials. Missing data mechanisms for a trial need to be considered during the planning phase and the impact on the objectives assessed. Another area for improvement is understanding the pattern of missing data observed during a trial, and thus the missing data mechanism, by plotting the data; for example, using Kaplan-Meier curves of time to withdrawal. Copyright (C) 2009 John Wiley & Sons, Ltd.
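
The abstract's suggestion of plotting time to withdrawal can be illustrated with a small sketch. The code below is not taken from the paper; it is a minimal product-limit (Kaplan-Meier) estimator written in plain Python, and the function name `kaplan_meier`, the variable names and the ten-participant data set are all hypothetical. It treats withdrawal as the event and completion of follow-up as censoring.

```python
from collections import Counter


def kaplan_meier(times, withdrew):
    """Return [(t, S(t))] at each distinct withdrawal time.

    times    -- time on study for each participant (e.g. weeks)
    withdrew -- True if the participant withdrew at that time,
                False if they completed follow-up or were censored
    """
    events = Counter(t for t, w in zip(times, withdrew) if w)
    exits = Counter(times)       # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)     # withdrawals observed at time t
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]      # drop withdrawals and censored cases
    return curve


# Hypothetical data: weeks on study and whether the participant withdrew.
weeks = [2, 4, 4, 6, 8, 8, 12, 12, 12, 12]
withdrew = [True, True, False, True, False, True, False, False, False, False]

for t, s in kaplan_meier(weeks, withdrew):
    print(f"week {t}: estimated proportion still on study = {s:.3f}")
```

In practice, such curves would typically be drawn per treatment arm, so that differences in the timing of withdrawal between arms become visible; that comparison, rather than the estimator itself, is what may inform thinking about the missing data mechanism.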

Recommended

Article Sport Sciences

Missing Data in Sport Science: A Didactic Example Using Wearables in American Football

Matthew S. Tenan

Summary: Data collection in the real world is challenging and often results in missing data. However, the importance of handling missing data properly for unbiased analysis and decision making is often overlooked in sport science and medicine. This article demonstrates the potential problems caused by missing data and proposes some imputation solutions to maintain the integrity of the data.

SPORTS MEDICINE (2023)

Article Health Care Sciences & Services

Multiple imputation with missing data indicators

Lauren J. Beesley, Irina Bondarenko, Michael R. Elliott, Allison W. Kurian, Steven J. Katz, Jeremy M. G. Taylor

Summary: This paper describes how to generalize the sequential regression multiple imputation procedure to handle non-random missingness when missingness may depend on other variables. The method reduces bias in the final analysis compared to standard techniques, using approximation strategies involving inclusion of an offset in the imputation model.

STATISTICAL METHODS IN MEDICAL RESEARCH (2021)

Article Urology & Nephrology

A practical guide to multiple imputation of missing data in nephrology

Katrina Blazek, Anita van Zwieten, Valeria Saglimbene, Armando Teixeira-Pinto

Summary: Health data often have missing values, and utilizing multiple imputation techniques can help reduce bias and maintain sample size. Correct specification of the imputation model is crucial for the validity of analyses. Considerations such as missing mechanism, imputation method, and result reporting are important when conducting research with multiply imputed data.

KIDNEY INTERNATIONAL (2021)

Article Computer Science, Artificial Intelligence

Multiple imputation method of missing credit risk assessment data based on generative adversarial networks

Feng Zhao, Yan Lu, Xinning Li, Lina Wang, Yingjie Song, Deming Fan, Caiming Zhang, Xiaobo Chen

Summary: Credit risk assessment is crucial for banks in loan approval and risk management. However, missing credit risk data can significantly reduce the effectiveness of the assessment model. In this paper, a novel method named MGAIN is proposed to accurately predict missing data through subset selection and multiple imputation strategy, improving the accuracy of the imputation model.

APPLIED SOFT COMPUTING (2022)

Article Mathematics

A Noise-Aware Multiple Imputation Algorithm for Missing Data

Fangfang Li, Hui Sun, Yu Gu, Ge Yu

Summary: This paper proposes NPMI, a noise-aware multiple imputation algorithm for missing values in static data. Different multiple imputation models are proposed according to the missing data mechanism. A method for determining the imputation order when multiple variables are missing is given. Experiments on real and synthetic datasets verify the accuracy and efficiency of the proposed algorithm.

MATHEMATICS (2023)

Article Engineering, Multidisciplinary

Double-cycle weighted imputation method for wastewater treatment process data with multiple missing patterns

Han Honggui, Sun Meiting, Wu Xiaolong, Li Fangyu

Summary: This article proposes a double-cycle weighted imputation (DCWI) method to deal with multiple missing patterns in the wastewater treatment process. The method maximizes the utilization of available information to improve imputation accuracy and experimental results show its superiority over comparison methods.

SCIENCE CHINA-TECHNOLOGICAL SCIENCES (2022)

Article Multidisciplinary Sciences

HarmonizR enables data harmonization across independent proteomic datasets with appropriate handling of missing values

Hannah Voss, Simon Schlumbohm, Philip Barwikowski, Marcus Wurlitzer, Matthias Dottermusch, Philipp Neumann, Hartmut Schlueter, Julia E. Neumann, Christoph Krisp

Summary: HarmonizR is an efficient tool for missing data tolerant experimental variance reduction, which does not require data imputation and can be easily adjusted for individual dataset properties and user preferences. It demonstrated successful data harmonization for different tissue preservation techniques, LC-MS/MS instrumentation setups, and quantification approaches, and outperformed data imputation methods in detecting significant proteins.

NATURE COMMUNICATIONS (2022)

Article Health Care Sciences & Services

Handling missing data in clinical research

Martijn W. Heymans, Jos W. R. Twisk

Summary: Proper handling of missing data is crucial, and consideration should be given to the missing data mechanism. Multiple imputation is highly recommended for estimating missing values. It is important to prevent missing data rather than to treat them afterwards.

JOURNAL OF CLINICAL EPIDEMIOLOGY (2022)

Article Computer Science, Artificial Intelligence

Missing Data Analysis in Regression

C. G. Marcelino, G. M. C. Leite, P. Celes, C. E. Pedreira

Summary: This paper investigates the effects of, and possible solutions to, incomplete databases in regression and provides a systematic view of how missing data may affect regression results by analyzing publicly available databases. The results indicate that the impact of missing data can be significant, and that the K-Nearest Neighbors method performs better in regression with missing data.

APPLIED ARTIFICIAL INTELLIGENCE (2022)

Article Ecology

Handling missing values in trait data

Thomas F. Johnson, Nick J. B. Isaac, Agustin Paviolo, Manuela Gonzalez-Suarez

Summary: The study evaluated the performance of approaches for handling missing values in biased datasets and found that imputation can effectively handle missing data in some conditions but is not always the best solution. None of the tested methods could effectively deal with severe biases, highlighting the importance of rigorous data checking and proposing variables to assist researchers in detecting and minimizing errors in incomplete datasets.

GLOBAL ECOLOGY AND BIOGEOGRAPHY (2021)

Article Computer Science, Artificial Intelligence

Missing value estimation using clustering and deep learning within multiple imputation framework

Manar D. Samad, Sakib Abrar, Norou Diawara

Summary: This paper proposes methods to improve the imputation accuracy of the MICE algorithm by using ensemble learning and deep neural networks. The results of extensive analyses on multiple datasets show that the proposed methods outperform other state-of-the-art imputation algorithms, leading to better imputation accuracy and classification accuracy.

KNOWLEDGE-BASED SYSTEMS (2022)

Article Multidisciplinary Sciences

Adaptive kernel fuzzy clustering for missing data

Anny K. G. Rodrigues, Raydonal Ospina, Marcelo R. P. Ferreira

Summary: This study proposes and evaluates a Kernel Fuzzy C-means clustering algorithm with local adaptive distances in dealing with missing data, showing better performance under the Partial Distance Strategy (PDS) and Optimal Completion Strategy (OCS) for clustering.

PLOS ONE (2021)

Article Automation & Control Systems

NMVI: A data-splitting based imputation technique for distinct types of missing data

Hutashan Vishal Bhagat, Manminder Singh

Summary: This article introduces a novel technique for estimating missing values, which splits the dataset into complete and incomplete subsets and sets an upper limit for each class with missing data to estimate missing values more accurately. Experimental results demonstrate the efficient estimation capability of this technique in datasets with different dimensions and missing rates.

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS (2022)

Article Computer Science, Artificial Intelligence

GAGIN: generative adversarial guider imputation network for missing data

Wei Wang, Yimeng Chai, Yue Li

Summary: In this paper, a novel generative adversarial guider imputation network (GAGIN), based on the generative adversarial network (GAN), is proposed for missing data imputation. Comprehensive experiments show that the proposed method outperforms state-of-the-art approaches and traditional methods in terms of RMSE on both numeric and image datasets.

NEURAL COMPUTING & APPLICATIONS (2022)

Article Mathematical & Computational Biology

Propensity score matching after multiple imputation when a confounder has missing data

Corentin Segalas, Clemence Leyrat, James R. Carpenter, Elizabeth Williamson

Summary: One popular method for addressing confounding in causal inference is propensity score matching, which matches treated patients with untreated patients based on similar propensity scores. Multiple imputation is often used for handling missing data. However, combining propensity score matching and multiple imputation can result in over-coverage of the confidence interval for the treatment effect estimate. In this article, the authors investigate the cause of this over-coverage and propose a correction to remove it.

STATISTICS IN MEDICINE (2023)

Editorial Material Pediatrics

Bed-sharing is a risk for sudden unexpected death in infancy

David Tappin, Edwin A. Mitchell, James Carpenter, Fern Hauck, Lynsay Allan

ARCHIVES OF DISEASE IN CHILDHOOD (2023)

Article Mathematical & Computational Biology

Estimating interactions and subgroup-specific treatment effects in meta-analysis without aggregation bias: A within-trial framework

Peter J. Godolphin, Ian R. White, Jayne F. Tierney, David J. Fisher

Summary: Estimating within-trial interactions in meta-analysis is crucial for assessing treatment effects across participant subgroups. Current methods have limitations, so this article proposes practical methods to estimate interactions across multiple subgroups and present the data using forest plots. The proposed framework is demonstrated with examples and implemented in Stata. It can effectively demonstrate how treatment effects differ across participant subgroups.

RESEARCH SYNTHESIS METHODS (2023)

Article Medicine, General & Internal

Knowledge user survey and Delphi process to inform development of a new risk of bias tool to assess systematic reviews with network meta-analysis (RoB NMA tool)

Carole Lunny, Areti Angeliki Veroniki, Brian Hutton, Ian White, J. P. T. Higgins, James M. Wright, Ji Yoon Kim, Sai Surabi Thirugnanasampanthar, Shazia Siddiqui, Jennifer Watt, Lorenzo Moja, Nichole Taske, Robert C. Lorenz, Savannah Gerrish, Sharon Straus, Virginia Minogue, Franklin Hu, Kevin Lin, Ayah Kapani, Samin Nagi, Lillian Chen, Mona Akbar-nejad, Andrea C. Tricco

Summary: This study aimed to develop a risk of bias (RoB) tool for assessing network meta-analysis (NMA) and gather opinions from knowledge users. The Delphi process and knowledge user survey identified the content of the RoB NMA tool and revealed a preference for assessing both individual NMA results and authors' conclusions.

BMJ EVIDENCE-BASED MEDICINE (2023)

Review Cardiac & Cardiovascular Systems

Bed rest duration and complications after transfemoral cardiac catheterization: a network meta-analysis

Erica Busca, Chiara Airoldi, Fabio Bertoncini, Giulia Buratti, Roberta Casarotto, Samanta Gaboardi, Fabrizio Faggiano, Michela Barisone, Ian R. White, Elias Allara, Alberto Dal Molin

Summary: This study assessed the effects of bed rest duration on short-term complications following transfemoral catheterization. The results showed that a short bed rest was not associated with complications, but longer duration of bed rest increased the risk of back pain. Patients can safely ambulate as early as 2 hours after the procedure.

EUROPEAN JOURNAL OF CARDIOVASCULAR NURSING (2023)

Article Mathematical & Computational Biology

Measuring the performance of prediction models to personalize treatment choice

Orestis Efthimiou, Jeroen Hoogland, Thomas P. A. Debray, Michael Seo, Toshiaki A. Furukawa, Matthias Egger, Ian R. White

Summary: When individual patient data from a randomized trial are available, statistical and machine learning methods can be used to develop models for predicting treatment effects and guide personalized treatment choices. This article proposes measures to evaluate personalized treatment effect predictions, including discrimination and calibration. The methods are applicable to different outcome types and prediction models.

STATISTICS IN MEDICINE (2023)

Article Medicine, Research & Experimental

A comparison of covariate adjustment approaches under model misspecification in individually randomized trials

Mia S. Tackney, Tim Morris, Ian White, Clemence Leyrat, Karla Diaz-Ordaz, Elizabeth Williamson

Summary: Adjusting for baseline covariates in randomized trials can lead to power gains and protect against chance imbalances. However, for continuous covariates, adjusted methods may misspecify the relationship between the covariate and outcome. Through simulation studies, we find that G-computation, IPTW, AIPTW, and TMLE offer improvement over ANCOVA when there is a non-linear interaction between treatment and a skewed covariate in small sample sizes.

TRIALS (2023)

Article Medicine, Research & Experimental

Statistical analysis plan for the motor neuron disease systematic multi-arm adaptive randomised trial (MND-SMART)

Richard A. Parker, Christopher J. Weir, Tra My Pham, Ian R. White, Nigel Stallard, Mahesh K. B. Parmar, Robert J. Swingler, Rachel S. Dakin, Suvankar Pal, Siddharthan Chandran

Summary: MND-SMART is a multi-arm, multi-stage, multi-centre randomized controlled trial for motor neuron disease. It compares the efficacy of memantine and trazodone with placebo, and may introduce other investigational treatments later. The co-primary outcomes are ALS-FRS-R functional outcome and overall survival. The trial randomizes participants 1:1:1 to receive placebo or one of the investigational treatments, with a maximum of 531 participants. Comparisons will be conducted in four stages, with the opportunity to stop randomizations to poorly performing arms. The final analysis will be based on a statistical analysis plan finalized in May 2022.

TRIALS (2023)

Article Clinical Neurology

Towards a multi-arm multi-stage platform trial of disease modifying approaches in Parkinson's disease

Tom Foltynie, Sonia Gandhi, Cristina Gonzalez-Robles, Marie-Louise Zeissler, Georgia Mills, Roger Barker, James Carpenter, Anette Schrag, Anthony Schapira, Oliver Bandmann, Stephen Mullin, Joy Duffen, Kevin McFarthing, Jeremy Chataway, Mahesh Parmar, Camille Carroll

Summary: Multi-arm, multi-stage platform designs have improved the efficiency of clinical trials in the field of oncology. Foltynie et al. discuss the challenges and considerations of using this approach to assess potential disease-modifying treatments in progressive neurological conditions such as Parkinson's disease.

BRAIN (2023)

Article Medicine, Research & Experimental

Using modified intention-to-treat as a principal stratum estimator for failure to initiate treatment

Brennan C. Kahan, Ian R. White, Mark Edwards, Michael O. Harhay

Summary: A modified intention-to-treat analysis, which excludes participants who do not begin treatment, can estimate the treatment effect in the subpopulation of participants who would begin treatment regardless of the assigned arm. This estimator is unbiased if the intercurrent event is not affected by the treatment arm and if participants in both arms would initiate treatment. The criteria for unbiasedness are the ability to measure participants who experience the event in each treatment arm and the reasonable assumption that treatment allocation does not affect initiation.

CLINICAL TRIALS (2023)

Article Medicine, Research & Experimental

Handling intercurrent events and missing data in non-inferiority trials using the estimand framework: A tuberculosis case study

Sunita Rehal, Suzie Cro, Patrick P. J. Phillips, Katherine Fielding, James R. Carpenter

Summary: This article discusses statistical methods for handling intercurrent events and missing values in non-inferiority studies. Using a tuberculosis clinical trial as a case study, the authors propose primary and additional estimands suitable for non-inferiority studies. Multiple imputation methods are used for estimation, and the results show that these methods provide a more accurate interpretation of the estimand.

CLINICAL TRIALS (2023)

Article Health Care Sciences & Services

Culturally Adapting a Digital Intervention to Reduce Suicidal Ideation for Syrian Asylum Seekers and Refugees in the United Kingdom: Protocol for a Qualitative Study

Oliver Beuthin, Kamaldeep Bhui, Ly-Mee Yu, Sadiya Shahid, Louay Almidani, Mariah Malak Bilalaga, Roshan Hussein, Alnarjes Harba, Yasmine Nasser

Summary: The study aims to increase access to mental health treatment for Syrian asylum seekers and refugees in the UK by culturally adapting a digital intervention to reduce suicidal ideation. The study will use experience-based co-design and conduct interviews to understand their experiences and perceptions. The results will be published in December 2023.

JMIR RESEARCH PROTOCOLS (2023)

Letter Mathematical & Computational Biology

Comment on Oberman & Vink: Should we fix or simulate the complete data in simulation studies evaluating missing data methods?

Tim P. Morris, Ian R. White, Suzie Cro, Jonathan W. Bartlett, James R. Carpenter, Tra My Pham

Summary: For simulation studies evaluating methods of handling missing data, generating partially observed data by fixing complete data and simulating missingness indicators repeatedly is rarely appropriate.

BIOMETRICAL JOURNAL (2023)

Article Medicine, Research & Experimental

When should factorial designs be used for late-phase randomised controlled trials?

Ian R. White, Alexander J. Szubert, Babak Choodari-Oskooei, A. Sarah Walker, Mahesh K. B. Parmar

Summary: Factorial designs, which use a combination of randomization and multiple interventions, can reduce resource and participant requirements. However, several factors need to be considered before using this design, including clinical, practical, statistical, and external issues. Key considerations include the requirement for a lower sample size and minimal interaction effects between interventions.

CLINICAL TRIALS (2023)

Article Health Care Sciences & Services

Improving transportability of randomized controlled trial inference using robust prediction methods

Michael R. Elliott, Orlagh Carroll, Richard Grieve, James Carpenter

Summary: Randomized controlled trials are considered the gold standard for assessing causal effects, but may still be biased in estimating population-level causal effects. Recent research suggests that incorporating information from probability samples can improve population causal inference in randomized controlled trials. This paper reviews recent work on transporting causal effect estimates from trials to populations, and proposes estimators using inverse probability weighting or prediction methods. The proposed methods do not require specific functional form or interaction, and can accommodate unequal probability of selection in benchmark or population samples.

STATISTICAL METHODS IN MEDICAL RESEARCH (2023)
