Article
Sport Sciences
Matthew S. Tenan
Summary: Data collection in the real world is challenging and often results in missing data for various reasons. However, the importance of properly handling missing data for unbiased analysis and decision making is often overlooked in sport science and medicine. This article aims to demonstrate the potential problems caused by missing data and proposes some imputation solutions to maintain the integrity of the data.
Article
Health Care Sciences & Services
Lauren J. Beesley, Irina Bondarenko, Michael R. Elliot, Allison W. Kurian, Steven J. Katz, Jeremy M. G. Taylor
Summary: This paper describes how to generalize the sequential regression multiple imputation procedure to handle non-random missingness when missingness may depend on other variables. The method reduces bias in the final analysis compared to standard techniques, using approximation strategies involving inclusion of an offset in the imputation model.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2021)
Article
Urology & Nephrology
Katrina Blazek, Anita van Zwieten, Valeria Saglimbene, Armando Teixeira-Pinto
Summary: Health data often have missing values, and multiple imputation techniques can help reduce bias and maintain sample size. Correct specification of the imputation model is crucial for the validity of analyses. Considerations such as the missingness mechanism, the imputation method, and the reporting of results are important when conducting research with multiply imputed data.
KIDNEY INTERNATIONAL
(2021)
Article
Computer Science, Artificial Intelligence
Feng Zhao, Yan Lu, Xinning Li, Lina Wang, Yingjie Song, Deming Fan, Caiming Zhang, Xiaobo Chen
Summary: Credit risk assessment is crucial for banks in loan approval and risk management. However, missing credit risk data can significantly reduce the effectiveness of the assessment model. In this paper, a novel method named MGAIN is proposed to accurately predict missing data through subset selection and multiple imputation strategy, improving the accuracy of the imputation model.
APPLIED SOFT COMPUTING
(2022)
Article
Mathematics
Fangfang Li, Hui Sun, Yu Gu, Ge Yu
Summary: This paper proposes NPMI, a noise-aware multiple imputation algorithm for missing values in static data. Different multiple imputation models are proposed according to the missing mechanism of the data. A method for determining the imputation order when multiple variables are missing is given. Experiments on real and synthetic datasets verify the accuracy and efficiency of the proposed algorithm.
Article
Engineering, Multidisciplinary
Han Honggui, Sun Meiting, Wu Xiaolong, Li Fangyu
Summary: This article proposes a double-cycle weighted imputation (DCWI) method to deal with multiple missing patterns in the wastewater treatment process. The method maximizes the use of available information to improve imputation accuracy, and experimental results show its superiority over the comparison methods.
SCIENCE CHINA-TECHNOLOGICAL SCIENCES
(2022)
Article
Multidisciplinary Sciences
Hannah Voss, Simon Schlumbohm, Philip Barwikowski, Marcus Wurlitzer, Matthias Dottermusch, Philipp Neumann, Hartmut Schlueter, Julia E. Neumann, Christoph Krisp
Summary: HarmonizR is an efficient, missing-data-tolerant tool for reducing experimental variance. It does not require data imputation and can be easily adjusted for individual dataset properties and user preferences. It demonstrated successful data harmonization for different tissue preservation techniques, LC-MS/MS instrumentation setups, and quantification approaches, and outperformed data imputation methods in detecting significant proteins.
NATURE COMMUNICATIONS
(2022)
Article
Health Care Sciences & Services
Martijn W. Heymans, Jos W. R. Twisk
Summary: Proper handling of missing data is crucial, and consideration should be given to the missing data mechanism. Multiple imputation is highly recommended for estimating missing values. Preventing missing data is preferable to treating them.
JOURNAL OF CLINICAL EPIDEMIOLOGY
(2022)
Article
Computer Science, Artificial Intelligence
C. G. Marcelino, G. M. C. Leite, P. Celes, C. E. Pedreira
Summary: This paper investigates the effects of, and possible solutions to, incomplete databases in regression. It provides a systematic view of how missing data may affect regression results by analyzing actual publicly available databases. The results indicate that the impact of missing data can be significant, and that the K-Nearest Neighbors method performs better in regression with missing data.
APPLIED ARTIFICIAL INTELLIGENCE
(2022)
Article
Ecology
Thomas F. Johnson, Nick J. B. Isaac, Agustin Paviolo, Manuela Gonzalez-Suarez
Summary: The study evaluated the performance of approaches for handling missing values in biased datasets and found that imputation can effectively handle missing data in some conditions but is not always the best solution. None of the tested methods could effectively deal with severe biases, highlighting the importance of rigorous data checking and proposing variables to assist researchers in detecting and minimizing errors in incomplete datasets.
GLOBAL ECOLOGY AND BIOGEOGRAPHY
(2021)
Article
Computer Science, Artificial Intelligence
Manar D. Samad, Sakib Abrar, Norou Diawara
Summary: This paper proposes methods to improve the imputation accuracy of the MICE algorithm by using ensemble learning and deep neural networks. The results of extensive analyses on multiple datasets show that the proposed methods outperform other state-of-the-art imputation algorithms, leading to better imputation accuracy and classification accuracy.
KNOWLEDGE-BASED SYSTEMS
(2022)
Article
Multidisciplinary Sciences
Anny K. G. Rodrigues, Raydonal Ospina, Marcelo R. P. Ferreira
Summary: This study proposes and evaluates a Kernel Fuzzy C-means clustering algorithm with local adaptive distances in dealing with missing data, showing better performance under the Partial Distance Strategy (PDS) and Optimal Completion Strategy (OCS) for clustering.
Article
Automation & Control Systems
Hutashan Vishal Bhagat, Manminder Singh
Summary: This article introduces a novel technique for estimating missing values, which splits the dataset into complete and incomplete subsets and sets an upper limit for each class with missing data to estimate missing values more accurately. Experimental results demonstrate the efficient estimation capability of this technique in datasets with different dimensions and missing rates.
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Wei Wang, Yimeng Chai, Yue Li
Summary: In this paper, a novel generative adversarial guider imputation network (GAGIN), based on the generative adversarial network (GAN), is proposed for missing data imputation. Comprehensive experiments show that the proposed method outperforms state-of-the-art approaches and traditional methods in terms of RMSE on both numeric and image datasets.
NEURAL COMPUTING & APPLICATIONS
(2022)
Article
Mathematical & Computational Biology
Corentin Segalas, Clemence Leyrat, James R. Carpenter, Elizabeth Williamson
Summary: One popular method for addressing confounding in causal inference is propensity score matching, which matches treated patients with untreated patients based on similar propensity scores. Multiple imputation is often used for handling missing data. However, combining propensity score matching and multiple imputation can result in over-coverage of the confidence interval for the treatment effect estimate. In this article, the authors investigate the cause of this over-coverage and propose a correction to remove it.
STATISTICS IN MEDICINE
(2023)
Editorial Material
Pediatrics
David Tappin, Edwin A. Mitchell, James Carpenter, Fern Hauck, Lynsay Allan
ARCHIVES OF DISEASE IN CHILDHOOD
(2023)
Article
Mathematical & Computational Biology
Peter J. Godolphin, Ian R. White, Jayne F. Tierney, David J. Fisher
Summary: Estimating within-trial interactions in meta-analysis is crucial for assessing treatment effects across participant subgroups. Current methods have limitations, so this article proposes practical methods to estimate interactions across multiple subgroups and present the data using forest plots. The proposed framework is demonstrated with examples and implemented in Stata. It can effectively demonstrate how treatment effects differ across participant subgroups.
RESEARCH SYNTHESIS METHODS
(2023)
Article
Medicine, General & Internal
Carole Lunny, Areti Angeliki Veroniki, Brian Hutton, Ian White, J. P. T. Higgins, James M. Wright, Ji Yoon Kim, Sai Surabi Thirugnanasampanthar, Shazia Siddiqui, Jennifer Watt, Lorenzo Moja, Nichole Taske, Robert C. Lorenz, Savannah Gerrish, Sharon Straus, Virginia Minogue, Franklin Hu, Kevin Lin, Ayah Kapani, Samin Nagi, Lillian Chen, Mona Akbar-nejad, Andrea C. Tricco
Summary: This study aimed to develop a risk of bias (RoB) tool for assessing network meta-analysis (NMA) and gather opinions from knowledge users. The Delphi process and knowledge user survey identified the content of the RoB NMA tool and revealed a preference for assessing both individual NMA results and authors' conclusions.
BMJ EVIDENCE-BASED MEDICINE
(2023)
Review
Cardiac & Cardiovascular Systems
Erica Busca, Chiara Airoldi, Fabio Bertoncini, Giulia Buratti, Roberta Casarotto, Samanta Gaboardi, Fabrizio Faggiano, Michela Barisone, Ian R. White, Elias Allara, Alberto Dal Molin
Summary: This study assessed the effects of bed rest duration on short-term complications following transfemoral catheterization. The results showed that a short bed rest was not associated with complications, but longer duration of bed rest increased the risk of back pain. Patients can safely ambulate as early as 2 hours after the procedure.
EUROPEAN JOURNAL OF CARDIOVASCULAR NURSING
(2023)
Article
Mathematical & Computational Biology
Orestis Efthimiou, Jeroen Hoogland, Thomas P. A. Debray, Michael Seo, Toshiaki A. Furukawa, Matthias Egger, Ian R. White
Summary: When individual patient data from a randomized trial are available, statistical and machine learning methods can be used to develop models for predicting treatment effects and guide personalized treatment choices. This article proposes measures to evaluate personalized treatment effect predictions, including discrimination and calibration. The methods are applicable to different outcome types and prediction models.
STATISTICS IN MEDICINE
(2023)
Article
Medicine, Research & Experimental
Mia S. Tackney, Tim Morris, Ian White, Clemence Leyrat, Karla Diaz-Ordaz, Elizabeth Williamson
Summary: Adjusting for baseline covariates in randomized trials can lead to power gains and protect against chance imbalances. However, for continuous covariates, adjusted methods may misspecify the relationship between the covariate and outcome. Through simulation studies, we find that G-computation, IPTW, AIPTW, and TMLE offer improvement over ANCOVA when there is a non-linear interaction between treatment and a skewed covariate in small sample sizes.
Article
Medicine, Research & Experimental
Richard A. Parker, Christopher J. Weir, Tra My Pham, Ian R. White, Nigel Stallard, Mahesh K. B. Parmar, Robert J. Swingler, Rachel S. Dakin, Suvankar Pal, Siddharthan Chandran
Summary: MND-SMART is a multi-arm, multi-stage, multi-centre randomized controlled trial for motor neuron disease. It compares the efficacy of memantine and trazodone with placebo, and may introduce other investigational treatments later. The co-primary outcomes are ALS-FRS-R functional outcome and overall survival. The trial randomizes participants 1:1:1 to receive placebo or one of the investigational treatments, with a maximum of 531 participants. Comparisons will be conducted in four stages, with the opportunity to stop randomizations to poorly performing arms. The final analysis will be based on a statistical analysis plan finalized in May 2022.
Article
Clinical Neurology
Tom Foltynie, Sonia Gandhi, Cristina Gonzalez-Robles, Marie-Louise Zeissler, Georgia Mills, Roger Barker, James Carpenter, Anette Schrag, Anthony Schapira, Oliver Bandmann, Stephen Mullin, Joy Duffen, Kevin McFarthing, Jeremy Chataway, Mahesh Parmar, Camille Carroll
Summary: Multi-arm, multi-stage platform designs have improved the efficiency of clinical trials in the field of oncology. Foltynie et al. discuss the challenges and considerations of using this approach to assess potential disease-modifying treatments in progressive neurological conditions such as Parkinson's disease.
Article
Medicine, Research & Experimental
Brennan C. Kahan, Ian R. White, Mark Edwards, Michael O. Harhay
Summary: A modified intention-to-treat analysis, which excludes participants who do not begin treatment, can estimate the treatment effect in the subpopulation of participants who would begin treatment regardless of the assigned arm. This estimator is unbiased if the intercurrent event is not affected by the treatment arm and if participants in both arms would initiate treatment. The criteria for unbiasedness are the ability to measure participants who experience the event in each treatment arm and the reasonable assumption that treatment allocation does not affect initiation.
Article
Medicine, Research & Experimental
Sunita Rehal, Suzie Cro, Patrick P. J. Phillips, Katherine Fielding, James R. Carpenter
Summary: This article discusses statistical methods for handling intercurrent events and missing values in non-inferiority studies. Using a tuberculosis clinical trial as a case study, the authors propose primary and additional estimands suitable for non-inferiority studies. Multiple imputation methods are used for estimation, and the results show that these methods provide a more accurate interpretation of the estimand.
Article
Health Care Sciences & Services
Oliver Beuthin, Kamaldeep Bhui, Ly-Mee Yu, Sadiya Shahid, Louay Almidani, Mariah Malak Bilalaga, Roshan Hussein, Alnarjes Harba, Yasmine Nasser
Summary: The study aims to increase access to mental health treatment for Syrian asylum seekers and refugees in the UK by culturally adapting a digital intervention to reduce suicidal ideation. The study will use experience-based co-design and conduct interviews to understand their experiences and perceptions. The results will be published in December 2023.
JMIR RESEARCH PROTOCOLS
(2023)
Letter
Mathematical & Computational Biology
Tim P. Morris, Ian R. White, Suzie Cro, Jonathan W. Bartlett, James R. Carpenter, Tra My Pham
Summary: For simulation studies evaluating methods of handling missing data, generating partially observed data by fixing complete data and simulating missingness indicators repeatedly is rarely appropriate.
BIOMETRICAL JOURNAL
(2023)
Article
Medicine, Research & Experimental
Ian R. White, Alexander J. Szubert, Babak Choodari-Oskooei, A. Sarah Walker, Mahesh K. B. Parmar
Summary: Factorial designs, which use a combination of randomization and multiple interventions, can reduce resource and participant requirements. However, several factors need to be considered before using this design, including clinical, practical, statistical, and external issues. Key considerations include the requirement for a lower sample size and minimal interaction effects between interventions.
Article
Health Care Sciences & Services
Michael R. Elliott, Orlagh Carroll, Richard Grieve, James Carpenter
Summary: Randomized controlled trials are considered the gold standard for assessing causal effects, but may still be biased in estimating population-level causal effects. Recent research suggests that incorporating information from probability samples can improve population causal inference in randomized controlled trials. This paper reviews recent work on transporting causal effect estimates from trials to populations, and proposes estimators using inverse probability weighting or prediction methods. The proposed methods do not require specific functional form or interaction, and can accommodate unequal probability of selection in benchmark or population samples.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2023)