Article
Multidisciplinary Sciences
Faisal Maqbool Zahid, Shahla Faisal, Christian Heumann
Summary: Multiple imputation (MI) is challenging in high-dimensional settings. A semi-compatible imputation model is proposed that relaxes the lasso penalty and uses a ridge penalty to address instability and convergence issues. The proposed approach outperforms existing MI techniques in simulation studies and on real-life datasets while addressing compatibility problems.
Article
Health Care Sciences & Services
Lauren J. Beesley, Irina Bondarenko, Michael R. Elliot, Allison W. Kurian, Steven J. Katz, Jeremy M. G. Taylor
Summary: This paper describes how to generalize the sequential regression multiple imputation procedure to handle non-random missingness when missingness may depend on other variables. The method reduces bias in the final analysis compared to standard techniques, using approximation strategies involving inclusion of an offset in the imputation model.
STATISTICAL METHODS IN MEDICAL RESEARCH
(2021)
Article
Mathematics
Xiaoning Li, Mulati Tuerde, Xijian Hu
Summary: This paper investigates the application of quantile regression models from a Bayesian perspective, proposing a hierarchical model framework and using Bayesian methods to handle missing data. The research findings demonstrate the significant advantages of the proposed methodology in both simulation and real data analysis.
Article
Mathematics
Fangfang Li, Hui Sun, Yu Gu, Ge Yu
Summary: This paper proposes NPMI, a noise-aware multiple imputation algorithm for missing values in static data. Different multiple imputation models are proposed according to the missingness mechanism of the data. A method for determining the imputation order when multiple variables have missing values is given. Experiments on real and synthetic datasets verify the accuracy and efficiency of the proposed algorithm.
Article
Statistics & Probability
Jialu Li, Guan Yu, Qizhai Li, Yufeng Liu
Summary: Modern high-dimensional statistical inference often faces the problem of missing data. In this article, we propose a new method called SCOM to deal with missing data occurring in predictors. SCOM makes full use of all available data and is robust with respect to various missing mechanisms.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2023)
Article
Psychology, Multidisciplinary
Heather J. Gunn, Panteha Hayati Rezvan, M. Isabel Fernandez, W. Scott Comulada
Summary: Psychological researchers often use standard linear regression to identify relevant predictors of an outcome of interest. Regularization methods like the LASSO can mitigate overfitting, increase interpretability, and improve prediction. However, handling missing data when using regularization-based variable selection methods is complicated. This tutorial describes three approaches for fitting a LASSO when using multiple imputation to handle missing data and highlights the need for additional research on best practices.
PSYCHOLOGICAL METHODS
(2023)
Article
Mathematical & Computational Biology
Yusuke Yamaguchi, Satoshi Yoshida, Toshihiro Misumi, Kazushi Maruo
Summary: Multiple imputation is a promising approach for handling missing data in longitudinal clinical studies, particularly when incorporating informative auxiliary variables. The Bayesian lasso imputation model demonstrated superior performance in simulation studies, providing unbiased treatment effect estimates and higher statistical power compared to conventional methods. Ignoring informative auxiliary variables can lead to serious bias and inflated type I error rates.
STATISTICS IN MEDICINE
(2022)
Article
Computer Science, Interdisciplinary Applications
Jingmao Li, Qingzhao Zhang, Song Chen, Kuangnan Fang
Summary: In this article, a novel weighted multiple blockwise imputation method is proposed to address the problem of high-dimensional regression with blockwise missing data. The method demonstrates superior performance in variable selection, parameter estimation, and prediction ability.
JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION
(2023)
Article
Management
Liyuan Cui, Yongmiao Hong, Yingxing Li, Junhui Wang
Summary: This paper proposes a novel large-dimensional positive definite covariance estimator for high-frequency data that performs well in the presence of microstructure noise and asynchronous trading.
MANAGEMENT SCIENCE
(2023)
Article
Biochemical Research Methods
Juan D. Henao, Michael Lauber, Manuel Azevedo, Anastasiia Grekova, Fabian Theis, Markus List, Christoph Ogris, Benjamin Schubert
Summary: This study integrated regression-based methods that can handle missingness into KiMONo, and benchmarked their performance on commonly encountered missing data scenarios in single- and multi-omics studies. The results showed that two-step approaches that explicitly handle missingness performed best for imbalanced omics-layers dimensions, while methods implicitly handling missingness performed best for balanced omics-layers dimensions. The study demonstrated the feasibility of robust multi-omics network inference in the presence of missing data with KiMONo.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Biochemical Research Methods
Ayyuce Begum Bektas, Cigdem Ak, Mehmet Gonen
Summary: With the increasing sizes of computational biology datasets, previous kernel-based machine learning algorithms have failed to provide satisfactory interpretability. To address this issue, we propose a fast and efficient multiple kernel learning algorithm that can extract significant information from genomic data. Our experiments demonstrate that the algorithm outperforms baseline methods while using only a small fraction of input features, and it has the potential to discover new biomarkers and therapeutic guidelines.
Article
Statistics & Probability
Alberto Brini, Edwin R. van den Heuvel
Summary: This article explores the imputation of missing data in high-dimensional datasets and compares different approaches using a linear mixed modeling framework. The recursive partitioning and predictive mean matching algorithms show superiority in terms of bias, mean squared error, and coverage of parameter estimates.
AMERICAN STATISTICIAN
(2023)
Article
Mathematics
Zhongzheng Wang, Guangming Deng, Jianqi Yu
Summary: The proposed group screening procedure based on the information gain ratio for a classification model is shown to have better screening performance and classification accuracy.
JOURNAL OF MATHEMATICS
(2022)
Article
Biology
Lauren Hoskovec, Matthew D. Koslovsky, Kirsten Koehler, Nicholas Good, Jennifer L. Peel, John Volckens, Ander Wilson
Summary: This paper presents an infinite hidden Markov model for multiple asynchronous multivariate time series with missing data. The model estimates hidden states and imputes missing data through beam sampling and a Bayesian multiple imputation algorithm. It performs well in simulation studies and real-case validation, showing improvements in estimation and imputation compared to existing approaches.
Article
Urology & Nephrology
Katrina Blazek, Anita van Zwieten, Valeria Saglimbene, Armando Teixeira-Pinto
Summary: Health data often have missing values, and utilizing multiple imputation techniques can help reduce bias and maintain sample size. Correct specification of the imputation model is crucial for the validity of analyses. Considerations such as missing mechanism, imputation method, and result reporting are important when conducting research with multiply imputed data.
KIDNEY INTERNATIONAL
(2021)
Article
Mathematical & Computational Biology
Ming Wang, Qi Long, Chixiang Chen, Lijun Zhang
STATISTICS IN MEDICINE
(2020)
Article
Biochemical Research Methods
Eun Jeong Min, Qi Long
BMC BIOINFORMATICS
(2020)
Article
Otorhinolaryngology
Alyssa M. Civantos, Yasmeen Byrnes, Changgee Chang, Aman Prasad, Kevin Chorath, Seerat K. Poonia, Carolyn M. Jenks, Andres M. Bur, Punam Thakkar, Evan M. Graboyes, Rahul Seth, Samuel Trosman, Anni Wong, Benjamin M. Laitman, Brianna N. Harris, Janki Shah, Vanessa Stubbs, Garret Choby, Qi Long, Christopher H. Rassekh, Erica Thaler, Karthik Rajasekaran
HEAD AND NECK-JOURNAL FOR THE SCIENCES AND SPECIALTIES OF THE HEAD AND NECK
(2020)
Article
Hematology
Baolin Tang, Lulu Huang, Huilan Liu, Siqi Cheng, Kaidi Song, Xuhan Zhang, Wen Yao, Lijuan Ning, Xiang Wan, Guangyu Sun, Yun Wu, Jiehui Cheng, Qi Long, Zimin Sun, Xiaoyu Zhu
Article
Radiology, Nuclear Medicine & Medical Imaging
Azadeh Elmi, Emily F. Conant, Andrew Kozlov, Anthony J. Young, Qi Long, Robert K. Doot, Elizabeth S. McDonald
Summary: The study findings suggest that, among breast cancer patients who undergo digital breast tomosynthesis (DBT) imaging at diagnosis, women with dense breasts benefit more from preoperative MR. In contrast, women imaged only with digital mammography (DM) show additional malignancy detection by MR regardless of breast density.
Meeting Abstract
Oncology
Mallika Marar, Long Qi, Ronac Mamtani, Vivek Narayan, Neha Vapiwala, Ravi Bharat Parikh
JOURNAL OF CLINICAL ONCOLOGY
(2021)
Article
Multidisciplinary Sciences
Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
Summary: The Layer-Peeled Model is introduced in this paper as a nonconvex optimization program to better understand deep neural networks. It is shown to inherit characteristics of well-trained neural networks and can help explain and predict common empirical patterns of deep-learning training. The model reveals phenomena such as neural collapse on balanced datasets and Minority Collapse on imbalanced datasets, providing insights into how to mitigate the latter.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2021)
Editorial Material
Oncology
Ming Wang, Qi Long
Summary: In recent years, there has been a growing recognition that P values are often misused or misinterpreted in biomedical research, especially with the emergence of big health data. To address this problem, sound study design and appropriate statistical analysis strategies are needed.
Article
Biology
Changgee Chang, Zhiqi Bu, Qi Long
Summary: Electronic health records (EHRs) provide opportunities for precision medicine, but sharing data is a challenge. We propose a method that aggregates data from external sites by treating it as missing data. We also suggest incorporating posterior samples from remote sites to improve parameter estimates.
Article
Statistics & Probability
Kan Chen, Siyu Heng, Qi Long, Bo Zhang
Summary: One central goal of observational study design is to incorporate non-experimental data into an approximate randomized controlled trial using statistical matching. However, residual imbalance due to imperfect matching of observed covariates often persists. This article presents two generic classes of exact statistical tests for a biased randomization assumption and introduces a quantity called residual sensitivity value (RSV) as a means to quantify the level of residual confounding due to imperfect matching of observed covariates in a matched sample. The proposed methodology is demonstrated through a re-examination of a famous observational study.
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS
(2023)
Letter
Oncology
Xinhe Shan, Qi Long, Alfred L. Garfall, Sandra P. Susanibar-Adaniya
BLOOD CANCER JOURNAL
(2023)
Proceedings Paper
Computer Science, Artificial Intelligence
Qinqing Zheng, Shuxiao Chen, Qi Long, Weijie Su
Summary: Federated learning is a training paradigm where clients collaboratively learn models while protecting the privacy of their local sensitive data. This paper introduces federated f-differential privacy and proposes PriFedSync, a private federated learning framework that achieves this privacy guarantee.
24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS)
(2021)
Proceedings Paper
Engineering, Biomedical
Yixue Feng, Mansu Kim, Xiaohui Yao, Kefei Liu, Qi Long, Li Shen
2020 IEEE 20TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2020)
(2020)
Proceedings Paper
Computer Science, Artificial Intelligence
Wenli Sun, Changgee Chang, Qi Long
2020 IEEE 7TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2020)
(2020)
Proceedings Paper
Computer Science, Artificial Intelligence
Changgee Chang, Jihwan Oh, Qi Long
PROCEEDINGS OF THE 2020 SIAM INTERNATIONAL CONFERENCE ON DATA MINING (SDM)
(2020)