Summary
Hyperopt: A Python library for optimizing the hyperparameters of machine learning algorithms
Authors: Bergstra, James, University of Waterloo; Yamins, Dan, Massachusetts Institute of Technology; Cox, David D., Harvard University
Track: Machine Learning
Most machine learning algorithms have hyperparameters that strongly affect end-to-end system performance, and tuning them to optimize that performance can be a daunting task. Hyperparameters come in many varieties: continuous-valued ones with and without bounds, discrete ones that are either ordered or unordered, and conditional ones that do not always apply (e.g., the parameters of an optional pre-processing stage). As a result, conventional continuous and combinatorial optimization algorithms either do not apply directly or operate without exploiting structure in the search space. Typically, hyperparameters are optimized beforehand by domain experts on unrelated problems, or manually for the problem at hand with the assistance of grid search. However, even random search has been shown to be competitive with these approaches [1].
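To make the variety of hyperparameters concrete, the following is a minimal sketch of random search over a small mixed search space, written with the Python standard library only. The parameter names (dropout, learning_rate, init_offset, n_layers, preprocessing, pca_components) and their ranges are hypothetical, chosen for illustration; this is not hyperopt's API.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration from a small, hypothetical search space."""
    config = {
        # Continuous with bounds: uniform on [0, 1].
        "dropout": rng.uniform(0.0, 1.0),
        # Continuous with bounds, sampled log-uniformly between 1e-5 and 1e-1.
        "learning_rate": math.exp(rng.uniform(math.log(1e-5), math.log(1e-1))),
        # Continuous without bounds: standard normal.
        "init_offset": rng.gauss(0.0, 1.0),
        # Discrete and ordered.
        "n_layers": rng.randint(1, 4),
        # Discrete and unordered (categorical).
        "preprocessing": rng.choice(["none", "pca"]),
    }
    # Conditional: only present when the optional PCA stage is enabled.
    if config["preprocessing"] == "pca":
        config["pca_components"] = rng.randint(2, 64)
    return config


# Random search simply draws independent configurations like these
# and evaluates each one.
rng = random.Random(0)
samples = [sample_config(rng) for _ in range(20)]
```

Note that the conditional parameter appears only in configurations where it applies, which is the kind of structure a flat grid over all parameters cannot express.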
Better hyperparameter optimization algorithms (HOAs) are needed for two reasons:
HOAs formalize the practice of model evaluation, so that benchmarking experiments can be reproduced by different people.
Learning-algorithm designers can deliver flexible, fully configurable implementations (e.g., of Deep Learning algorithms) to non-experts, as long as they also provide a corresponding HOA.
Hyperopt provides serial and parallelizable HOAs via a Python library [2, 3]. Fundamental to its design is a protocol for communication among (a) the description of a hyperparameter search space, (b) a hyperparameter evaluation function (the machine learning system), and (c) a hyperparameter search algorithm. This protocol allows generic HOAs (such as the bundled "TPE" algorithm) to work across a range of specific search problems. Specific machine learning algorithms (or algorithm families) are implemented as hyperopt search spaces in related projects: Deep Belief Networks [4], convolutional vision architectures [5], and scikit-learn classifiers [6]. My presentation will explain what problem hyperopt solves, how to use it, and how it can deliver accurate models from data alone, without operator intervention.
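The three-part separation described above can be sketched in plain Python. This is an illustrative stand-in under stated assumptions, not hyperopt's implementation: the names space, objective, random_suggest, and minimize are hypothetical, the objective is a toy function rather than a machine learning system, and the search algorithm is plain random search rather than TPE.

```python
import random


def space(rng):
    """(a) Search space: draws one configuration (hypothetical toy space)."""
    return {"x": rng.uniform(-5.0, 5.0), "kind": rng.choice(["square", "abs"])}


def objective(config):
    """(b) Evaluation function: returns a loss to minimize.

    A toy stand-in for training and validating a model.
    """
    d = config["x"] - 1.0
    return d * d if config["kind"] == "square" else abs(d)


def random_suggest(space, history, rng):
    """(c) Search algorithm: proposes the next configuration to try.

    Random search ignores the history of past trials; a model-based
    algorithm would use it to choose more promising configurations.
    """
    return space(rng)


def minimize(objective, space, suggest, max_evals, seed=0):
    """Generic driver: works with any space, objective, and suggest function."""
    rng = random.Random(seed)
    history = []  # list of (config, loss) pairs, one per trial
    for _ in range(max_evals):
        config = suggest(space, history, rng)
        history.append((config, objective(config)))
    best = min(history, key=lambda trial: trial[1])
    return best, history


(best_config, best_loss), history = minimize(
    objective, space, random_suggest, max_evals=200
)
```

Because the driver only sees the three pieces through their call signatures, swapping in a different search algorithm or a different search space requires no change to the other two. In hyperopt itself, these roles correspond to the space, fn, and algo arguments of hyperopt.fmin, with hyperopt.tpe.suggest as the bundled TPE algorithm.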