Journal
NEUROCOMPUTING
Volume 247, Pages 144-155
Publisher
ELSEVIER
DOI: 10.1016/j.neucom.2017.03.058
Keywords
Deep generative model; Hyperparameter optimization; Sequential model-based optimization; Contrastive divergence
Funding
- National Basic Research Program of China (973 Program) [2013CB336500]
- Chinese National 863 Program of Demonstration of Digital Medical Service and Technology in Destined Region [2012-AA02A614]
- National Youth Top-notch Talent Support Program
The performance of many machine learning algorithms depends crucially on their hyperparameter settings, especially in deep learning. Manually tuning hyperparameters is laborious and time consuming. To address this issue, Bayesian optimization (BO) methods and their extensions have been proposed to optimize hyperparameters automatically. However, they remain computationally expensive when applied to deep generative models (DGMs), because they treat the whole model as a single black-box function. This paper provides a new hyperparameter optimization procedure for the pre-training phase of DGMs, which exploits the layer-by-layer learning strategy to avoid lumping all layers into one black-box function. Following this procedure, we can optimize multiple hyperparameters adaptively using a Gaussian process. In contrast to traditional BO methods, which mainly target supervised models, the pre-training procedure is unsupervised, so no validation error is available. To alleviate this problem, this paper proposes a new holdout loss, the free energy gap, which accounts for both model fitting and over-fitting. Empirical evaluations demonstrate that our method not only speeds up hyperparameter optimization, but also significantly improves the performance of DGMs in both supervised and unsupervised learning tasks. (C) 2017 Elsevier B.V. All rights reserved.
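To make the free-energy-gap idea concrete: for a binary restricted Boltzmann machine (the layer-wise building block of many DGMs), the free energy of a visible vector v is the standard F(v) = -v·a - Σ_j softplus(b_j + (vW)_j), and a common over-fitting signal is the difference between the mean free energy on held-out and training data. The sketch below illustrates that signal with NumPy; the function names and the simple difference-of-means form are illustrative assumptions, not the paper's exact definition of its holdout loss.

```python
import numpy as np

def rbm_free_energy(v, W, vbias, hbias):
    """Free energy of binary visible vectors v (shape: n_samples x n_visible)
    for an RBM with weights W (n_visible x n_hidden) and bias vectors.
    F(v) = -v.a - sum_j softplus(b_j + (vW)_j)"""
    pre_activation = v @ W + hbias                      # (n_samples, n_hidden)
    softplus = np.logaddexp(0.0, pre_activation)        # log(1 + exp(x)), numerically stable
    return -(v @ vbias) - softplus.sum(axis=-1)         # (n_samples,)

def free_energy_gap(train_batch, holdout_batch, W, vbias, hbias):
    """Mean free energy on held-out data minus mean free energy on training
    data. A gap that grows during training suggests over-fitting; a gap near
    zero suggests the model generalizes similarly to both sets."""
    return (rbm_free_energy(holdout_batch, W, vbias, hbias).mean()
            - rbm_free_energy(train_batch, W, vbias, hbias).mean())
```

A scalar like this can serve as the holdout score that the Gaussian-process surrogate models when tuning each layer's hyperparameters, since it needs no labels.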