4.4 Article

Comparison of methods for imputing limited-range variables: a simulation study

Journal

BMC MEDICAL RESEARCH METHODOLOGY
Volume 14, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/1471-2288-14-57

Keywords

Multiple imputation; Limited-range; Skewed data; Missing data; Rounding; Truncated regression

Funding

  1. National Health and Medical Research Council [1053609]
  2. Centre of Research Excellence [1035261]
  3. Victorian Government's Operational Infrastructure Support Program
  4. [607400]

Background: Multiple imputation (MI) was developed as a method to enable valid inferences to be obtained in the presence of missing data rather than to re-create the missing values. Within the applied setting, it remains unclear how important it is that imputed values should be plausible for individual observations. One variable type for which MI may lead to implausible values is a limited-range variable, where imputed values may fall outside the observable range. The aim of this work was to compare methods for imputing limited-range variables, with a focus on those that restrict the range of the imputed values.

Methods: Using data from a study of adolescent health, we consider three variables based on responses to the General Health Questionnaire (GHQ), a tool for detecting minor psychiatric illness. These variables, based on different scoring methods for the GHQ, resulted in three continuous distributions with mild, moderate and severe positive skewness. In an otherwise complete dataset, we set 33% of the GHQ observations to missing completely at random or missing at random, repeating this process to create 1000 datasets with incomplete data for each scenario. For each dataset, we imputed values on the raw scale and following a zero-skewness log transformation using: univariate regression with no rounding; post-imputation rounding; truncated normal regression; and predictive mean matching. We estimated the marginal mean of the GHQ and the association between the GHQ and a fully observed binary outcome, comparing the results with complete data statistics.

Results: Imputation with no rounding performed well when applied to data on the raw scale. Post-imputation rounding and imputation using truncated normal regression produced higher marginal means than the complete data estimate when data had a moderate or severe skew, and this was associated with under-coverage of the complete data estimate. Predictive mean matching also produced under-coverage of the complete data estimate. For the estimate of association, all methods produced similar estimates to the complete data.

Conclusions: For data with a limited range, multiple imputation using techniques that restrict the range of imputed values can result in biased estimates for the marginal mean when data are highly skewed.
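The direction of the bias reported for post-imputation rounding can be illustrated with a toy simulation. The sketch below is not the authors' procedure: it uses hypothetical exponential data rather than the GHQ, assumes a score range of 0 to 12, sets values to missing completely at random, and fills each gap with a single draw from a normal distribution fitted to the observed values (no covariates, no parameter uncertainty, no pooling across multiple imputations). Post-imputation rounding is approximated by moving out-of-range draws to the nearest boundary. All function names are made up for this example.

```python
import random
import statistics

LOWER, UPPER = 0.0, 12.0  # assumed score range, for illustration only


def simulate_scores(n, rng):
    """Hypothetical positively skewed scores on [LOWER, UPPER] (not the GHQ data)."""
    return [min(UPPER, rng.expovariate(1 / 1.5)) for _ in range(n)]


def mcar_mask(n, prop, rng):
    """Flag each value as missing with probability `prop`, independent of the data."""
    return [rng.random() < prop for _ in range(n)]


def impute_unrounded(values, missing, rng):
    """Replace missing values with draws from a normal fitted to the observed values."""
    observed = [v for v, m in zip(values, missing) if not m]
    mu, sd = statistics.mean(observed), statistics.stdev(observed)
    return [rng.gauss(mu, sd) if m else v for v, m in zip(values, missing)]


def round_to_range(completed, missing):
    """Move out-of-range imputed values to the nearest boundary."""
    return [min(UPPER, max(LOWER, v)) if m else v
            for v, m in zip(completed, missing)]


def run_demo(seed=1, n=2000, prop_missing=0.33):
    """Return the mean of the complete, unrounded-imputed and rounded-imputed data."""
    rng = random.Random(seed)
    scores = simulate_scores(n, rng)
    missing = mcar_mask(n, prop_missing, rng)
    unrounded = impute_unrounded(scores, missing, rng)
    rounded = round_to_range(unrounded, missing)
    return {
        "complete": statistics.mean(scores),
        "unrounded": statistics.mean(unrounded),
        "rounded": statistics.mean(rounded),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name:>10}: {value:.3f}")
```

Because the simulated data pile up near the lower bound, many more normal draws fall below 0 than above 12, so moving them to the boundary should raise the mean relative to the unrounded version. That is the same direction as the bias described in the Results, although this sketch says nothing about its size in the GHQ data.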

Recommended

Article Psychology, Clinical

Adolescent antecedents of maternal and paternal perinatal depression: a 36-year prospective cohort

Kimberly C. Thomson, Helena Romaniuk, Christopher J. Greenwood, Primrose Letcher, Elizabeth Spry, Jacqui A. Macdonald, Helena M. McAnally, George J. Youssef, Jennifer McIntosh, Delyse Hutchinson, Robert J. Hancox, George C. Patton, Craig A. Olsson

Summary: For the majority of parents, perinatal depression is a continuation of mental health problems that began well before pregnancy. Strategies to promote good perinatal mental health should start before parenthood and include both men and women.

PSYCHOLOGICAL MEDICINE (2021)

Article Psychiatry

Do risk factors for adolescent internalising difficulties differ depending on childhood internalising experiences?

Meredith O'Connor, Helena Romaniuk, Sarah Gray, Galina Daraganova

Summary: This study found a continuity of internalising difficulties from childhood to adolescence, with a higher risk of adolescent internalising problems for those who had experienced internalising symptoms in childhood. Other known risk factors were also associated with adolescent internalising problems.

SOCIAL PSYCHIATRY AND PSYCHIATRIC EPIDEMIOLOGY (2021)

Review Health Care Sciences & Services

A scoping review of studies using observational data to optimise dynamic treatment regimens

Robert K. Mahar, Myra B. McGuinness, Bibhas Chakraborty, John B. Carlin, Maarten J. IJzerman, Julie A. Simpson

Summary: Observational DTR models are a recent development, primarily applied in areas such as HIV/AIDS, cancer, and diabetes. Various statistical methods are used in these studies, including inverse-probability weighting, the parametric G-formula, and Q-learning. Studies are generally categorized into those focusing on real-world clinical questions and methodological developments, with the former tending to use well-established statistical methods.

BMC MEDICAL RESEARCH METHODOLOGY (2021)

Article Medicine, General & Internal

Seroprevalence of SARS-CoV-2-specific antibodies in Sydney after the first epidemic wave of 2020

Heather F. Gidding, Dorothy A. Machalek, Alexandra J. Hendry, Helen E. Quinn, Kaitlyn Vette, Frank H. Beard, Hannah S. Shilling, Rena Hirani, Iain B. Gosbell, David O. Irving, Linda Hueston, Marnie Downes, John B. Carlin, Matthew V. N. O'Sullivan, Dominic E. Dwyer, John M. Kaldor, Kristine Macartney

Summary: The study estimated SARS-CoV-2-specific antibody seroprevalence in Sydney after the first epidemic wave of COVID-19, with results showing a prevalence below 1%, indicating low community transmission. Early control measures were successful in limiting the spread of COVID-19, but ongoing efforts to reduce transmission are still crucial.

MEDICAL JOURNAL OF AUSTRALIA (2021)

Article Pediatrics

Growth and adrenarche: findings from the CATS observational study

Anne-Lise Goddings, Russell M. Viner, Lisa Mundy, Helena Romaniuk, Charlotte Molesworth, John B. Carlin, Nicholas B. Allen, George C. Patton

Summary: The study reveals that anthropometric measures are positively associated with salivary androgen concentrations in pre-adolescent children, with overweight or obese individuals showing higher testosterone and DHEA concentrations. Obese individuals are more likely to have higher androgen levels compared to normal weight individuals of the same age group.

ARCHIVES OF DISEASE IN CHILDHOOD (2021)

Article Mathematical & Computational Biology

When should matching be used in the design of cluster randomized trials?

Patty Chondros, Obioha C. Ukoumunne, Jane M. Gunn, John B. Carlin

Summary: Simulation was used to compare the efficiency of matched-pair design, stratified design, and simple design in cluster randomized trials. Results showed that matched-pair design was generally the most efficient when the matching correlation was moderate to strong, while stratified design and simple design were more efficient for weak matching correlations.

STATISTICS IN MEDICINE (2021)

Article Mathematical & Computational Biology

Multiple imputation of semi-continuous exposure variables that are categorized for analysis

Cattram D. Nguyen, Margarita Moreno-Betancur, Laura Rodwell, Helena Romaniuk, John B. Carlin, Katherine J. Lee

Summary: Semi-continuous variables have unique characteristics and various methods for imputation of missing values. Direct imputation of categories or deriving categories after imputation performed well, while methods requiring rounding showed poor performance. The parameter of interest should be considered when selecting an imputation procedure.

STATISTICS IN MEDICINE (2021)

Article Mathematical & Computational Biology

Evaluation of approaches for accommodating interactions and non-linear terms in multiple imputation of incomplete three-level data

Rushani Wijesuriya, Margarita Moreno-Betancur, John B. Carlin, Anurika P. De Silva, Katherine J. Lee

Summary: Three-level data structures in health research studies often have missing data, which are addressed with multiple imputation approaches. Various methods can be used to account for the three-level structure in substantive analysis models, particularly when interactions or quadratic effects are involved. The substantive model compatible MI has shown promise in single-level data, but there are limited approaches for incomplete three-level data.

BIOMETRICAL JOURNAL (2022)

Article Allergy

Modifiable factors associated with pediatric asthma readmissions: a multi-center linked cohort study

Katherine Y. H. Chen, Wanyu Chu, Renee Jones, Peter Vuillermin, David Fuller, David Tran, Lena Sanci, Shivanthan Shanthikumar, John Carlin, Harriet Hiscock

Summary: This study examined the rates of hospital readmission and emergency department re-presentation for asthma in Australian children. It also explored the effects of modifiable factors on hospital readmission, including the role of general practitioners and home environmental factors. The findings suggest that hospital readmissions for asthma are increasing among Australian children, and highlight the important role of general practitioners in managing pediatric asthma. There was no apparent association between hospital or home environmental factors and hospital readmissions.

JOURNAL OF ASTHMA (2023)

Article Cardiac & Cardiovascular Systems

Impact of a heart failure nurse practitioner service on rehospitalizations, emergency presentations, and survival in patients hospitalized with acute heart failure

Andrea Driscoll, Sharon Meagher, Rhoda Kennedy, David L. Hare, Douglas F. Johnson, Kristina Asker, Omar Farouque, Helena Romaniuk, Liliana Orellana

Summary: This study explored the impact of inpatient HF NP (heart failure nurse practitioners) service on the 12-month rehospitalization, emergency department presentations, and mortality in patients with heart failure. The results showed that patients who received HF NP service had a lower risk of rehospitalization and ED presentations, and improved referrals to a home visiting program.

EUROPEAN JOURNAL OF CARDIOVASCULAR NURSING (2023)

Article Allergy

Primary health care utilization and hospital readmission in children with asthma: a multi-site linked data cohort study

Katherine Y. H. Chen, Renee Jones, Shaoke Lei, Shivanthan Shanthikumar, Lena Sanci, John Carlin, Harriet Hiscock

Summary: This study investigated primary health care utilization among 767 children with asthma and examined the effect of primary care factors on asthma hospital readmission. The results showed that primary care use by children with asthma was often irregular and lacked continuity. Increased frequency of visits was associated with reduced readmissions and emergency department presentations.

JOURNAL OF ASTHMA (2023)

Article Health Care Sciences & Services

Should multiple imputation be stratified by exposure group when estimating causal effects via outcome regression in observational studies?

Jiaxin Zhang, S. Ghazaleh Dashti, John B. Carlin, Katherine J. Lee, Margarita Moreno-Betancur

Summary: Despite recent advances in causal inference methods, outcome regression remains the most widely used approach for estimating causal effects in epidemiological studies with a single-point exposure and outcome. Missing data are common in these studies, and complete-case analysis (CCA) and multiple imputation (MI) are two frequently used methods for handling them. However, it is unclear whether MI should be conducted by exposure group in observational studies.

BMC MEDICAL RESEARCH METHODOLOGY (2023)

Article Public, Environmental & Occupational Health

Assumptions and analysis planning in studies with missing data in multiple variables: moving beyond the MCAR/MAR/MNAR classification

Katherine J. Lee, John B. Carlin, Julie A. Simpson, Margarita Moreno-Betancur

Summary: Researchers are advised to classify their missing data as MCAR, MAR, or MNAR when analyzing the data. However, the original classification by Rubin in the 1970s has two major problems. First, it is difficult to assess the plausibility of the MAR assumption when there are missing data in multiple variables. Second, MCAR and MAR are not necessary conditions for consistent estimation, so the classification does not determine the best approach for handling missing data.

INTERNATIONAL JOURNAL OF EPIDEMIOLOGY (2023)

Article Multidisciplinary Sciences

Evaluation of the introduction of a healthy food and drink policy in 13 community recreation centres on the healthiness and nutrient content of customer purchases and business outcomes: An observational study

Shaan Stephanie Naughton, Helena Romaniuk, Anna Peeters, Alexandra Chung, Alethea Jerebine, Liliana Orellana, Tara Boelsen-Robinson

Summary: This observational study assessed the introduction of a comprehensive healthy food and drink policy and its impact on business outcomes and the healthiness of purchases. The implementation of the policy resulted in a shift towards healthier purchases, with a decrease in the sales of unhealthy drinks and an increase in the sales of healthier options. The study highlights the importance of policies to improve the health of retail food environments.

PLOS ONE (2023)

Article Critical Care Medicine

Characteristics and outcomes of children receiving intensive care therapy within 12 hours following a medical emergency team event

Ben Gelbart, Suzanna Vidmar, David Stephens, Daryl Cheng, Jenny Thompson, Ahuva Segal, Tali Gadish, John Carlin

Summary: Approximately one-fifth of MET events resulted in intensive care admission, and nearly half of these patients required ICT within 12 hours. Patients requiring ICT had longer duration of respiratory support, intensive care and hospital length of stay, and increased mortality. Age < 1 year and experiencing a critical event increased the risk of requiring ICT.

CRITICAL CARE AND RESUSCITATION (2021)
