Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges (2021-07-13)

TL;DR

This paper gives practical recommendations for the important choices to be made when conducting hyperparameter optimization (HPO): the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.

Abstract

Most machine learning algorithms are configured by a set of hyperparameters whose values must be carefully chosen and which often considerably impact performance. To avoid a time‐consuming and irreproducible manual process of trial‐and‐error to find well‐performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods—for example, based on resampling error estimation for supervised machine learning—can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods, from simple techniques such as grid or random search to more advanced methods like evolution strategies, Bayesian optimization, Hyperband, and racing. This work gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with machine learning pipelines, runtime improvements, and parallelization.
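As a rough illustration of the simplest setup the abstract mentions, random search scored by a resampling error estimate, the following is a minimal sketch and not the paper's implementation. Everything in it is an assumption chosen for illustration: the toy one-dimensional ridge regression learner, the synthetic data, the log-uniform search range for the regularization strength `lam`, and the helper names `ridge_fit`, `cv_mse`, and `random_search`.

```python
import random


def ridge_fit(xs, ys, lam):
    # Closed-form 1-D ridge regression without intercept:
    # w = sum(x*y) / (sum(x*x) + lam), where lam is the hyperparameter.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def cv_mse(xs, ys, lam, k=5):
    # k-fold cross-validation estimate of the mean squared error
    # for a single hyperparameter configuration.
    n = len(xs)
    total = 0.0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point forms the test fold
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        w = ridge_fit(train_x, train_y, lam)
        total += sum((ys[i] - w * xs[i]) ** 2 for i in test_idx)
    return total / n  # each point is tested exactly once


def random_search(xs, ys, n_trials=30, seed=0):
    # Sample configurations independently and keep the one with the
    # lowest cross-validated error.
    rng = random.Random(seed)
    best_lam, best_err = None, float("inf")
    for _ in range(n_trials):
        lam = 10 ** rng.uniform(-3, 3)  # log-uniform over [1e-3, 1e3]
        err = cv_mse(xs, ys, lam)
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err


# Synthetic data (hypothetical): y = 2x plus small Gaussian noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(-1, 1) for _ in range(50)]
ys = [2.0 * x + data_rng.gauss(0, 0.1) for x in xs]

best_lam, best_err = random_search(xs, ys)
print(f"best lam = {best_lam:.4g}, cross-validated MSE = {best_err:.4g}")
```

The cross-validated error of the selected configuration is optimistically biased as an estimate of final performance, because the same resampling was used to choose it. This is one reason performance evaluation appears among the practical choices the abstract lists.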

Authors

A. Boulesteix

B. Bischl

M. Lindauer
