Hyperparameter Optimization of Deep Neural Networks Using Non-Probabilistic RBF Surrogate Model
Ilija Ilievski, Taimoor Akhtar, Jiashi Feng, and Christine Annette Shoemaker
arXiv e-Print archive - 2016
Keywords:
cs.AI, cs.LG, stat.ML
First published: 2016/07/28
Abstract: Recently, Bayesian optimization has been successfully applied for optimizing
hyperparameters of deep neural networks, significantly outperforming the
expert-set hyperparameter values. The methods approximate and minimize the
validation error as a function of hyperparameter values through probabilistic
models like Gaussian processes. However, probabilistic models that require a
prior distribution of the errors may not be adequate for approximating very
complex error functions of deep neural networks. In this work, we propose to
employ a radial basis function as the surrogate of the error function for
optimizing both continuous and integer hyperparameters. The proposed
non-probabilistic algorithm, called Hyperparameter Optimization using RBF and
DYCORS (HORD), searches the surrogate for the most promising hyperparameter
values while providing a good balance between exploration and exploitation.
Extensive evaluations demonstrate HORD significantly outperforms the
well-established Bayesian optimization methods such as Spearmint and TPE, both
in terms of finding a near-optimal solution with fewer expensive function
evaluations, and in terms of the final validation error. Further, HORD performs
equally well in low- and high-dimensional hyperparameter spaces, and by
avoiding expensive covariance computations, can also scale to a large number of
observations.
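
The abstract outlines a surrogate-based loop: fit an RBF model to the observed (hyperparameters, validation error) pairs, use it to pick the next hyperparameter setting to evaluate, and repeat. Below is a minimal sketch of that idea, not the authors' implementation: it assumes SciPy's RBFInterpolator, replaces the expensive network training with a cheap synthetic objective, generates candidates by perturbing a random subset of the incumbent's coordinates (only loosely mimicking DYCORS), and scores candidates by the surrogate prediction alone, whereas HORD also weighs distance to already-evaluated points to balance exploration and exploitation.

```python
# Minimal sketch of RBF-surrogate hyperparameter search (not the authors' code).
# The expensive objective is replaced by a cheap synthetic function so the
# example runs end to end; in practice it would train a network with the
# given hyperparameters and return its validation error.

import numpy as np
from scipy.interpolate import RBFInterpolator

def validation_error(x):
    # Stand-in for an expensive objective (train a model, return val. error).
    return float(np.sum((x - 0.3) ** 2) + 0.01 * np.random.rand())

rng = np.random.default_rng(0)
dim = 6                                         # number of hyperparameters, normalized to [0, 1]
X = rng.uniform(0.0, 1.0, size=(2 * dim, dim))  # initial design
y = np.array([validation_error(x) for x in X])

for it in range(40):
    surrogate = RBFInterpolator(X, y, kernel="cubic")  # fit RBF surrogate to all observations
    best = X[np.argmin(y)]

    # Candidate points: perturb a random subset of coordinates of the current
    # best point (a crude stand-in for DYCORS candidate generation, where the
    # perturbation probability decays as iterations accumulate).
    cands = np.tile(best, (100, 1))
    p = max(0.2, 1.0 - it / 40)
    mask = rng.random(cands.shape) < p
    mask[np.arange(len(cands)), rng.integers(0, dim, len(cands))] = True  # perturb >= 1 coord
    cands = np.clip(cands + mask * rng.normal(0.0, 0.2, cands.shape), 0.0, 1.0)

    # Evaluate the expensive objective only at the most promising candidate
    # according to the surrogate, then refit on the enlarged data set.
    x_next = cands[np.argmin(surrogate(cands))]
    X = np.vstack([X, x_next])
    y = np.append(y, validation_error(x_next))

print("best validation error found:", y.min())
```

The key contrast with GP-based Bayesian optimization is visible in the loop: refitting the RBF surrogate avoids the covariance-matrix machinery of a Gaussian process, which is why the abstract claims better scaling to many observations and to higher-dimensional hyperparameter spaces.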