Variable Selection With Prior Information for Generalized Linear Models via the Prior LASSO Method

Journal

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
Volume 111, Issue 513, Pages 355-376

Publisher

AMER STATISTICAL ASSOC
DOI: 10.1080/01621459.2015.1008363

Keywords

Asymptotic efficiency; Oracle inequalities; Solution path; Weak oracle property

Funding

  1. National Institutes of Health (NIH) [R01 DA016750, R01 DA029081]
  2. Wellcome Trust [076113, 085475]

LASSO is a popular statistical tool, often used in conjunction with generalized linear models, that can simultaneously select variables and estimate parameters. When there are many variables of interest, as in current biological and biomedical studies, the power of LASSO can be limited. Fortunately, a large amount of biological and biomedical data has already been collected, and it may contain useful information about the importance of certain variables. This article proposes an extension of LASSO, namely prior LASSO (pLASSO), to incorporate that prior information into penalized generalized linear models. This is achieved by adding to the LASSO criterion function a measure of the discrepancy between the prior information and the model. For linear regression, the whole solution path of the pLASSO estimator can be found with a procedure similar to least angle regression (LARS). Asymptotic theory and simulation results show that pLASSO provides significant improvement over LASSO when the prior information is relatively accurate. When the prior information is less reliable, pLASSO remains robust to the misspecification. We illustrate the application of pLASSO using a real dataset from a genome-wide association study. Supplementary materials for this article are available online.
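
To make the idea of an added discrepancy term concrete, here is a minimal sketch for the linear regression case. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the criterion, so the sketch assumes the discrepancy is a squared error between the model fit and a vector of responses predicted from prior information. The names `prior_lasso_linear`, `y_prior`, and `eta` (the weight on the prior term) are hypothetical. Under that assumption the criterion reduces, up to a constant, to an ordinary LASSO problem on a blended response, which is solved here with plain coordinate descent rather than a LARS-type path algorithm.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Average inner product of column j with the partial residual
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def prior_lasso_linear(X, y, y_prior, lam, eta):
    """Hypothetical pLASSO-style fit for linear regression.

    Assumed criterion (an illustration, not quoted from the article):
        (1/2n)||y - X b||^2 + (eta/2n)||y_prior - X b||^2 + lam * ||b||_1
    Up to an additive constant this is (1 + eta) times the LASSO criterion
    with response (y + eta * y_prior) / (1 + eta) and penalty lam / (1 + eta).
    """
    blended = [(yi + eta * ypi) / (1.0 + eta) for yi, ypi in zip(y, y_prior)]
    return lasso_cd(X, blended, lam / (1.0 + eta))


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [3.0, 0.1]
    y_prior = [3.0, 2.1]  # assumed predictions derived from prior information
    print(prior_lasso_linear(X, y, y_prior, lam=0.5, eta=0.0))  # plain LASSO
    print(prior_lasso_linear(X, y, y_prior, lam=0.5, eta=1.0))  # prior-weighted
```

In this sketch, setting `eta` to 0 recovers ordinary LASSO, while larger values pull the fit toward the prior predictions. That is one way to picture the trade-off the abstract describes between accurate and unreliable prior information.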
