Review

Statistical Analysis of Zero-Inflated Nonnegative Continuous Data: A Review

Journal

STATISTICAL SCIENCE
Volume 34, Issue 2, Pages 253-279

Publisher

INST MATHEMATICAL STATISTICS-IMS
DOI: 10.1214/18-STS681

Keywords

Two-part model; Tobit model; health econometrics; semiparametric regression; joint model; cure rate; frailty model; splines

Funding

  1. AHRQ [R01 HS 020263]
  2. NIH/NCI [R01 CA 85848]
  3. NSF [DMS-1308009]

Zero-inflated nonnegative continuous (or semicontinuous) data arise frequently in biomedical, economic, and ecological studies. Examples include substance abuse, medical costs, medical care utilization, biomarkers (e.g., CD4 cell counts, coronary artery calcium scores), single cell gene expression rates, and (relative) microbiome abundance. Such data are often characterized by a large proportion of zero values together with positive continuous values that are right-skewed and heteroscedastic. Both features suggest that no simple parametric distribution is likely to be suitable for modeling this type of outcome. In this paper, we review statistical methods for analyzing zero-inflated nonnegative outcome data. We start with the cross-sectional setting, discussing ways to separate zero and positive values and introducing flexible models to characterize right skewness and heteroscedasticity in the positive values. We then present models for correlated zero-inflated nonnegative continuous data, using random effects to account for the correlation among repeated measures from the same subject and the correlation across different parts of the model. We also discuss extensions to related topics, for example, zero-inflated count and survival data, nonlinear covariate effects, and joint models of longitudinal zero-inflated nonnegative continuous data and survival. Finally, we present applications to three real datasets (i.e., microbiome, medical costs, and alcohol drinking) to illustrate these methods. Example code is provided to facilitate applications of these methods.
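
To make the idea of separating zero and positive values concrete, here is a minimal sketch of an intercept-only two-part model. It is not the authors' example code. The log-normal assumption for the positive part, the function name `fit_two_part`, and the cost values are all hypothetical choices made for illustration; the review itself covers more flexible models for skewness and heteroscedasticity.

```python
import math
from statistics import fmean, pvariance


def fit_two_part(y):
    """Fit an intercept-only two-part model to nonnegative data.

    Part 1: P(Y > 0), estimated by the sample proportion of positives.
    Part 2: log(Y) given Y > 0 is assumed Normal(mu, sigma2) (an assumption
            of this sketch), estimated by maximum likelihood.
    """
    if any(v < 0 for v in y):
        raise ValueError("outcomes must be nonnegative")
    positives = [v for v in y if v > 0]
    if not positives:
        raise ValueError("need at least one positive value")

    # Part 1: probability of a positive outcome.
    p_positive = len(positives) / len(y)

    # Part 2: log-normal fit to the positive values only.
    logs = [math.log(v) for v in positives]
    mu = fmean(logs)
    sigma2 = pvariance(logs, mu)  # divides by n, i.e. the MLE

    # Overall mean combines both parts: E[Y] = P(Y > 0) * E[Y | Y > 0].
    mean = p_positive * math.exp(mu + sigma2 / 2)
    return {"p_positive": p_positive, "mu": mu, "sigma2": sigma2, "mean": mean}


# Hypothetical medical-cost data: six zeros and four positive values.
costs = [0, 0, 0, 0, 0, 0, 120.0, 85.5, 2300.0, 410.0]
fit = fit_two_part(costs)
print(fit)
```

In a regression setting, the sample proportion would typically be replaced by a logistic model and the log-scale mean by a linear predictor, so that covariates can affect the two parts differently.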

Recommended

Article Biology

Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix-variate fMRI data

Dong Liu, Changwei Zhao, Yong He, Lei Liu, Ying Guo, Xinsheng Zhang

Summary: This paper proposes a method for clustering and estimating heterogeneous graphs in fMRI data, achieving good results by fully exploiting the group differences of conditional dependence relationships among brain regions. The method constructs individual-level between-region network measures and uses a modified difference of convex programming with the alternating direction method of multipliers (DC-ADMM) algorithm to solve the optimization problem.

BIOMETRICS (2023)

Editorial Material Oncology

Ensuring Employment After Cancer Diagnosis: Are Workable Solutions Obvious?

Cathy J. Bradley, Ya-Chen Tina Shih, K. Robin Yabroff

JOURNAL OF CLINICAL ONCOLOGY (2023)

Article Oncology

Disparity in checkpoint inhibitor utilization among commercially insured adult patients with metastatic lung cancer

Meng Li, Kaiping Liao, Alice J. Chen, Tina Cascone, Yu Shen, Qian Lu, Ya-Chen Tina Shih

Summary: Nationwide, there is evidence to suggest that metastatic lung cancer patients residing in counties with a higher percentage of racialized population experience slower initiation of immune checkpoint inhibitor (ICI) therapy despite having a higher density of medical oncologists in their neighborhood.

JNCI-JOURNAL OF THE NATIONAL CANCER INSTITUTE (2023)

Article Mathematical & Computational Biology

Refined moderation analysis with categorical outcomes in precision medicine

Xiaogang Su, Youngjoo Cho, Liqiang Ni, Lei Liu, Elise Dusseldorp

Summary: Moderation analysis is crucial in precision medicine research. By exchanging the roles of outcome and treatment variable, equivalent estimation of heterogeneous treatment effects can be achieved in logistic regression models. This study establishes the joint asymptotic normality for the two estimators, enabling refined inference in moderation analysis.

STATISTICS IN MEDICINE (2023)

Article Oncology

Efficacy of a Smoking Cessation Intervention for Survivors of Cervical Intraepithelial Neoplasia or Cervical Cancer: A Randomized Controlled Trial

Jennifer I. Vidrine, Steven K. Sutton, David W. Wetter, Ya-Chen Tina Shih, Lois M. Ramondetta, Linda S. Elting, Joan L. Walker, Katie M. Smith, Summer G. Frank-Pearce, Yisheng Li, Sarah R. Jones, Darla E. Kendzor, Vani N. Simmons, Damon J. Vidrine

Summary: The purpose of this study was to evaluate the long-term efficacy of Motivation And Problem Solving (MAPS), a novel treatment well-suited to meeting the smoking cessation needs of women who smoke and have a history of cervical intraepithelial neoplasia (CIN) or cervical cancer. It was found that MAPS led to a greater than two-fold increase in smoking abstinence among survivors of CIN and cervical cancer at 12 months, but the effect was no longer significant at 18 months.

JOURNAL OF CLINICAL ONCOLOGY (2023)

Editorial Material Oncology

Ecological and individualistic fallacies in health disparities research

Ya-Chen Tina Shih, Cathy Bradley, K. Robin Yabroff

JNCI-JOURNAL OF THE NATIONAL CANCER INSTITUTE (2023)

Article Oncology

Medicaid expansion, chemotherapy delays, and racial disparities among women with early-stage breast cancer

Mariana Chavez-MacGregor, Xiudong Lei, Catalina Malinowski, Hui Zhao, Ya-Chen Shih, Sharon H. Giordano

Summary: This study used the National Cancer Database to investigate the impact of Medicaid expansion on the timing and delays of adjuvant chemotherapy among early-stage breast cancer patients. The results showed that after Medicaid expansion, the proportion of Black and Hispanic patients experiencing delays in chemotherapy initiation decreased, narrowing the racial disparities.

JNCI-JOURNAL OF THE NATIONAL CANCER INSTITUTE (2023)

Article Oncology

Oncologist Participation and Performance in the Merit-Based Incentive Payment System

Vishal R. Patel, Thomas B. Cwalina, Arjun Gupta, Nico Nortje, Samyukta Mullangi, Ravi B. Parikh, Ya-Chen Tina Shih, S. M. Qasim Hussaini

Summary: This cross-sectional study examined oncologist participation and performance in the 2019 MIPS. The oncologist participation rate (86%) was lower than the overall participation rate (97%). Oncologists using alternative payment models (APMs) as their filing source had higher MIPS scores, indicating the importance of organizational resources for participants.

ONCOLOGIST (2023)

Review Health Care Sciences & Services

An overview of optimal designs under a given budget in cluster randomized trials with a binary outcome

Jingxia Liu, Lei Liu, Aimee S. James, Graham A. Colditz

Summary: Cluster randomized trial design is expensive and it is important to develop an optimal design to minimize costs. Local optimal designs aim to minimize the variance of the treatment effect under a fixed budget. This requires the input of an association parameter and involves the consideration of enrollment feasibility. Our contributions include summarizing available local optimal designs and proposing new designs for different scenarios, as well as developing Statistical Analysis System (SAS) macros for all optimal designs.

STATISTICAL METHODS IN MEDICAL RESEARCH (2023)

Article Mathematical & Computational Biology

Healthcare center clustering for Cox's proportional hazards model by fusion penalty

Lili Liu, Kevin He, Di Wang, Shujie Ma, Annie Qu, Lu Lin, J. Philip Miller, Lei Liu

Summary: There is a growing research interest in evaluating the performance of healthcare centers based on patient outcomes. Conventional assessments can be done using fixed or random effects models, like provider profiling. A new method is proposed using fusion penalty to cluster healthcare centers based on survival outcomes, providing a data-driven approach for grouping without prior knowledge. An efficient algorithm is developed to implement the proposed method, and its validity is demonstrated through simulation studies and application to kidney transplant registry data.

STATISTICS IN MEDICINE (2023)

Article Oncology

Incorporating Cost Measures Into the Merit-Based Incentive Payment System: Implications for Oncologists

Vishal R. Patel, Thomas B. Cwalina, Nico Nortje, Samyukta Mullangi, Ravi B. Parikh, Ya-Chen Tina Shih, Arjun Gupta, S. M. Qasim Hussaini

Summary: The Merit-Based Incentive Payment System (MIPS) is the only federally mandated value-based payment model for oncologists. The inclusion of cost measures in MIPS may disproportionately affect oncologists, who have higher costs of care compared to other specialties. This study examines the implications of incorporating cost measures on physician reimbursements and highlights the need for specialty-specific recalibration to ensure fairness and preserve healthcare quality.

JCO ONCOLOGY PRACTICE (2023)

Article Oncology

Screening for lung cancer: 2023 guideline update from the American Cancer Society

Andrew M. D. Wolf, Kevin C. Oeffinger, Tina Ya-Chen Shih, Louise C. Walter, Timothy R. Church, Elizabeth T. H. Fontham, Elena B. Elkin, Ruth D. Etzioni, Carmen E. Guerra, Rebecca B. Perkins, Karli K. Kondo, Tyler B. Kratzer, Deana Manassaram-Baptiste, William L. Dahut, Robert A. Smith

Summary: Lung cancer is the leading cause of cancer-related deaths and years of life lost in the US. Early detection through screening has been shown to reduce mortality. The American Cancer Society has updated its guidelines for lung cancer screening, recommending annual low-dose CT screening for individuals aged 50-80 who currently smoke or formerly smoked and have a significant smoking history.

CA-A CANCER JOURNAL FOR CLINICIANS (2023)

Editorial Material Oncology

Causal Inference in Oncology Comparative Effectiveness Research Using Observational Data: Are Instrumental Variables Underutilized?

Marcelo Coca Perraillon, Ya-Chen Tina Shih

JOURNAL OF CLINICAL ONCOLOGY (2023)

Article Mathematical & Computational Biology

A flexible quasi-likelihood model for microbiome abundance count data

Yiming Shi, Huilin Li, Chan Wang, Jun Chen, Hongmei Jiang, Ya-Chen T. Shih, Haixiang Zhang, Yizhe Song, Yang Feng, Lei Liu

Summary: In this article, a flexible model for microbiome count data is proposed. The model is based on a quasi-likelihood framework, which does not assume any specific distribution for the microbiome count but assumes the variance as an unknown but smooth function of the mean. Simulation studies demonstrate that the flexible quasi-likelihood method provides valid inferential results compared to the negative binomial generalized linear model (GLM) and Poisson GLM. The utility of the method is further demonstrated using a real microbiome study on the relationship between adenomas and microbiota. An R package, fql, is provided for applying the method.

STATISTICS IN MEDICINE (2023)

Article Mathematical & Computational Biology

Exploring causal mechanisms and quantifying direct and indirect effects using a joint modeling approach for recurrent and terminal events

Fang Niu, Cheng Zheng, Lei Liu

Summary: Recurrent events and terminal events are often related in biomedical studies. Although joint models have been proposed to analyze their correlation, there is a lack of suitable methods to investigate the causal mechanisms between specific exposures, recurrent events, and terminal events.

STATISTICS IN MEDICINE (2023)
