Article

Multiple imputation in the presence of high-dimensional data

Journal

STATISTICAL METHODS IN MEDICAL RESEARCH
Volume 25, Issue 5, Pages 2021-2035

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/0962280213511027

Keywords

Bayesian lasso regression; high-dimensional data; missing data; multiple imputation; regularized regression

Funding

  1. PCORI [ME-1303-5840]

Missing data are frequently encountered in biomedical, epidemiologic and social research. It is well known that a naive analysis without adequate handling of missing data may lead to bias and/or loss of efficiency. Partly due to its ease of use, multiple imputation has become increasingly popular in practice for handling missing data. However, it is unclear what the best strategy is for conducting multiple imputation in the presence of high-dimensional data. To answer this question, we investigate several approaches that use regularized regression and Bayesian lasso regression to impute missing values in the presence of high-dimensional data. We compare the performance of these methods through numerical studies, in which we also evaluate the impact of the dimension of the data, the size of the true active set for imputation, and the strength of correlation. Our numerical studies show that, in the presence of high-dimensional data, the standard multiple imputation approach performs poorly, and that the imputation approach using Bayesian lasso regression achieves, in most cases, better performance than the other imputation methods, including the standard imputation approach using the correctly specified imputation model. Our results suggest that Bayesian lasso regression and its extensions are better suited for multiple imputation in the presence of high-dimensional data than the other regression methods.
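The abstract does not say how estimates from the imputed datasets are combined. As a minimal sketch, assuming the standard pooling step for multiple imputation (Rubin's rules), the per-dataset estimates and their variances can be combined as follows. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M point estimates and their variances with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching within-imputation variance of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor M - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


if __name__ == "__main__":
    # Hypothetical estimates from M = 3 imputed datasets.
    q, t = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(q, t)
```

The pooling step is the same whichever imputation model generates the completed datasets; only the way missing values are drawn differs between the methods compared in the abstract.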

Recommended

Article Mathematical & Computational Biology

Assessing predictive accuracy of survival regressions subject to nonindependent censoring

Ming Wang, Qi Long, Chixiang Chen, Lijun Zhang

STATISTICS IN MEDICINE (2020)

Article Biochemical Research Methods

Sparse multiple co-Inertia analysis with application to integrative analysis of multi-Omics data

Eun Jeong Min, Qi Long

BMC BIOINFORMATICS (2020)

Article Otorhinolaryngology

Mental health among otolaryngology resident and attending physicians during the COVID-19 pandemic: National study

Alyssa M. Civantos, Yasmeen Byrnes, Changgee Chang, Aman Prasad, Kevin Chorath, Seerat K. Poonia, Carolyn M. Jenks, Andres M. Bur, Punam Thakkar, Evan M. Graboyes, Rahul Seth, Samuel Trosman, Anni Wong, Benjamin M. Laitman, Brianna N. Harris, Janki Shah, Vanessa Stubbs, Garret Choby, Qi Long, Christopher H. Rassekh, Erica Thaler, Karthik Rajasekaran

HEAD AND NECK-JOURNAL FOR THE SCIENCES AND SPECIALTIES OF THE HEAD AND NECK (2020)

Article Hematology

Recombinant human thrombopoietin promotes platelet engraftment after umbilical cord blood transplantation

Baolin Tang, Lulu Huang, Huilan Liu, Siqi Cheng, Kaidi Song, Xuhan Zhang, Wen Yao, Lijuan Ning, Xiang Wan, Guangyu Sun, Yun Wu, Jiehui Cheng, Qi Long, Zimin Sun, Xiaoyu Zhu

BLOOD ADVANCES (2020)

Article Radiology, Nuclear Medicine & Medical Imaging

Preoperative breast MR imaging in newly diagnosed breast cancer: Comparison of outcomes based on mammographic modality, breast density and breast parenchymal enhancement

Azadeh Elmi, Emily F. Conant, Andrew Kozlov, Anthony J. Young, Qi Long, Robert K. Doot, Elizabeth S. McDonald

Summary: The study findings suggest that, among breast cancer patients imaged with digital breast tomosynthesis (DBT) at diagnosis, women with dense breasts benefit more from preoperative MR. In contrast, women imaged only with digital mammography (DM) show additional malignancy detection by MR regardless of breast density.

CLINICAL IMAGING (2021)

Meeting Abstract Oncology

Racial disparities in efficacy of first-line abiraterone in metastatic castrate-resistant prostate cancer (mCRPC).

Mallika Marar, Long Qi, Ronac Mamtani, Vivek Narayan, Neha Vapiwala, Ravi Bharat Parikh

JOURNAL OF CLINICAL ONCOLOGY (2021)

Article Multidisciplinary Sciences

Exploring deep neural networks via layer-peeled model: Minority collapse in imbalanced training

Cong Fang, Hangfeng He, Qi Long, Weijie J. Su

Summary: The Layer-Peeled Model is introduced in this paper as a nonconvex optimization program to better understand deep neural networks. It is shown to inherit characteristics of well-trained neural networks and can help explain and predict common empirical patterns of deep-learning training. The model reveals phenomena such as neural collapse on balanced datasets and Minority Collapse on imbalanced datasets, providing insights into how to mitigate the latter.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

Editorial Material Oncology

Addressing Common Misuses and Pitfalls of P values in Biomedical Research

Ming Wang, Qi Long

Summary: In recent years, there has been a growing recognition that P values are often misused or misinterpreted in biomedical research, especially with the emergence of big health data. To address this problem, sound study design and appropriate statistical analysis strategies are needed.

CANCER RESEARCH (2022)

Article Biology

CEDAR: communication efficient distributed analysis for regressions

Changgee Chang, Zhiqi Bu, Qi Long

Summary: Electronic health records (EHRs) provide opportunities for precision medicine, but sharing data is a challenge. We propose a method that aggregates data from external sites by treating it as missing data. We also suggest incorporating posterior samples from remote sites to improve parameter estimates.

BIOMETRICS (2023)

Article Statistics & Probability

Testing Biased Randomization Assumptions and Quantifying Imperfect Matching and Residual Confounding in Matched Observational Studies

Kan Chen, Siyu Heng, Qi Long, Bo Zhang

Summary: One central goal of observational study design is to incorporate non-experimental data into an approximate randomized controlled trial using statistical matching. However, residual imbalance due to imperfect matching of observed covariates often persists. This article presents two generic classes of exact statistical tests for a biased randomization assumption and introduces a quantity called residual sensitivity value (RSV) as a means to quantify the level of residual confounding due to imperfect matching of observed covariates in a matched sample. The proposed methodology is demonstrated through a re-examination of a famous observational study.

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS (2023)

Letter Oncology

High SOX2 expression is associated with poor survival in patients with newly diagnosed multiple myeloma

Xinhe Shan, Qi Long, Alfred L. Garfall, Sandra P. Susanibar-Adaniya

BLOOD CANCER JOURNAL (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Federated f-Differential Privacy

Qinqing Zheng, Shuxiao Chen, Qi Long, Weijie Su

Summary: Federated learning is a training paradigm where clients collaboratively learn models while protecting the privacy of their local sensitive data. This paper introduces federated f-differential privacy and proposes a private federated learning framework, PriFedSync, which achieves this privacy guarantee.

24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS) (2021)

Proceedings Paper Engineering, Biomedical

Deep Multiview Learning to Identify Population Structure with Multimodal Imaging

Yixue Feng, Mansu Kim, Xiaohui Yao, Kefei Liu, Qi Long, Li Shen

2020 IEEE 20TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2020) (2020)

Proceedings Paper Computer Science, Artificial Intelligence

Joint Bayesian Variable Selection and Graph Estimation for Non-linear SVM with Application to Genomics Data

Wenli Sun, Changgee Chang, Qi Long

2020 IEEE 7TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2020) (2020)

Proceedings Paper Computer Science, Artificial Intelligence

GRIA: Graphical Regularization for Integrative Analysis

Changgee Chang, Jihwan Oh, Qi Long

PROCEEDINGS OF THE 2020 SIAM INTERNATIONAL CONFERENCE ON DATA MINING (SDM) (2020)