Article

Penalized Logistic Regression Analysis for Genetic Association Studies of Binary Phenotypes

Journal

HUMAN HEREDITY
Volume 87, Issue 3-4, Pages 69-86

Publisher

KARGER
DOI: 10.1159/000525650

Keywords

Rare genetic variants; Penalized logistic regression; log-F priors; Monte Carlo EM; Laplace approximation; Data augmentation

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC)
  2. Canadian Institutes of Health Research
  3. Lotte and John Hecht Memorial Foundation
  4. Canadian Cancer Society
  5. [RGPIN-05595]

We propose a method for single rare variant analysis with binary phenotypes using logistic regression penalized by log-F priors. Our two-step approach involves estimating a shrinkage parameter m and then conducting log-F-penalized logistic regression analyses of all variants using the estimated m. Simulation studies and application to real data demonstrate that our method achieves lower bias and mean squared error compared to other methods.
Introduction: Increasingly, logistic regression methods for genetic association studies of binary phenotypes must be able to accommodate data sparsity, which arises from unbalanced case-control ratios and/or rare genetic variants. Sparseness leads to maximum likelihood estimators (MLEs) of log-OR parameters that are biased away from their null value of zero and tests with inflated type I errors. Different penalized likelihood methods have been developed to mitigate sparse data bias. We study penalized logistic regression using a class of log-F priors indexed by a shrinkage parameter m to shrink the biased MLE toward zero.

Methods: We proposed a two-step approach to the analysis of a genetic association study: first, a set of variants that show evidence of association with the trait is used to estimate m; second, the estimated m is used for log-F-penalized logistic regression analyses of all variants using data augmentation with standard software. Our estimate of m is the maximizer of a marginal likelihood obtained by integrating the latent log-ORs out of the joint distribution of the parameters and observed data. We consider two approximate approaches to maximizing the marginal likelihood: (i) a Monte Carlo EM algorithm and (ii) a Laplace approximation to each integral, followed by derivative-free optimization of the approximation.

Results: We evaluated the statistical properties of our proposed two-step method and compared its performance to other shrinkage methods by a simulation study. Our simulation studies suggest that the proposed log-F-penalized approach has lower bias and mean squared error than other methods considered. We also illustrated the approach on data from a study of genetic associations with Super Senior cases and middle-aged controls.

Discussion/Conclusion: We have proposed a method for single rare variant analysis with binary phenotypes by logistic regression penalized by log-F priors. Our method has the advantage of being easily extended to correct for confounding due to population structure and genetic relatedness through a data augmentation approach.
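
The second step of the approach, log-F-penalized logistic regression by data augmentation, can be illustrated with a minimal sketch. It is not the authors' implementation. It relies on the standard result that a log-F(m, m) prior on a log-OR is equivalent to adding pseudo-observations with m/2 successes and m/2 failures, with the variant covariate set to 1 and the intercept column set to 0. The toy data, the fixed value m = 2, and the helper names `fit_logistic` and `logf_augment` are assumptions made for illustration; in the paper m is estimated in a first step, which this sketch does not attempt.

```python
import numpy as np

def fit_logistic(X, y, weights, n_iter=50, tol=1e-10):
    """Weighted logistic regression by Newton-Raphson.
    X must already contain the intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (weights * (y - p))
        hess = (X * (weights * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def logf_augment(X, y, m):
    """Append two pseudo-records encoding a log-F(m, m) prior on the
    coefficient of the LAST column of X (hypothetical helper).
    Pseudo-records: one success and one failure, each with weight m/2,
    covariate = 1 and intercept = 0, so the intercept is not penalized."""
    pseudo = np.zeros((2, X.shape[1]))
    pseudo[:, -1] = 1.0
    X_aug = np.vstack([X, pseudo])
    y_aug = np.concatenate([y, [1.0, 0.0]])
    w_aug = np.concatenate([np.ones(len(y)), [m / 2.0, m / 2.0]])
    return X_aug, y_aug, w_aug

# Toy sparse data (assumed): 20 cases, 180 controls, 5 carriers of a rare variant.
n = 200
y = np.zeros(n)
y[:20] = 1.0          # first 20 subjects are cases
g = np.zeros(n)
g[:3] = 1.0           # 3 carriers among cases
g[20:22] = 1.0        # 2 carriers among controls
X = np.column_stack([np.ones(n), g])

m = 2.0               # assumed shrinkage parameter, fixed for illustration

beta_mle = fit_logistic(X, y, np.ones(n))
X_aug, y_aug, w_aug = logf_augment(X, y, m)
beta_pen = fit_logistic(X_aug, y_aug, w_aug)

print("MLE log-OR:      ", beta_mle[1])
print("Penalized log-OR:", beta_pen[1])
```

Because the penalty enters only through extra weighted rows, the same augmented data could be passed to any standard logistic regression routine that accepts observation weights, which is what makes the data augmentation formulation convenient.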
