3260 papers • 126 benchmarks • 313 datasets
Hyperparameter Optimization is the problem of choosing a set of optimal hyperparameters for a learning algorithm. Whether the algorithm is suitable for the data depends directly on these hyperparameters, which influence how much the model overfits or underfits. Each model requires different assumptions, weights, or training speeds for different types of data under the conditions of a given loss function. Source: Data-driven model for fracturing design optimization: focus on building digital database and production forecast
(Image credit: Papersgraph)
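To make the task concrete, below is a minimal random-search sketch that tunes two random-forest hyperparameters by cross-validated score; the estimator, dataset, and budget are illustrative assumptions, not taken from any cited paper.

```python
# Minimal random-search sketch for hyperparameter optimization.
# The estimator, dataset, and budget are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, random_state=0)

best_score, best_params = -np.inf, None
for _ in range(20):                      # evaluate 20 random configurations
    params = {
        "n_estimators": int(rng.integers(10, 200)),
        "max_depth": int(rng.integers(2, 16)),
    }
    model = RandomForestClassifier(**params, random_state=0)
    score = cross_val_score(model, X, y, cv=3).mean()
    if score > best_score:               # keep the best configuration seen
        best_score, best_params = score, params

print(best_params, best_score)
```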
These leaderboards are used to track progress in Hyperparameter Optimization
Use these libraries to find Hyperparameter Optimization models and implementations
No subtasks available.
A novel algorithm, Hyperband, is introduced for hyperparameter optimization, framed as a pure-exploration non-stochastic infinite-armed bandit problem in which a predefined resource, such as iterations, data samples, or features, is allocated to randomly sampled configurations.
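The elimination subroutine that Hyperband repeats across brackets (each bracket trading off the number of configurations against the starting resource) is successive halving; a minimal sketch follows, where `sample_config` and `evaluate` are hypothetical stand-ins for the user's sampler and trainer.

```python
# Minimal successive-halving sketch (the subroutine inside Hyperband).
# `sample_config` and `evaluate` are hypothetical user-supplied callables:
# evaluate(config, r) trains config with resource r and returns a loss.
def successive_halving(sample_config, evaluate, n=27, r=1, eta=3):
    configs = [sample_config() for _ in range(n)]
    while len(configs) > 1:
        # score every surviving configuration at the current budget r
        losses = [(evaluate(c, r), c) for c in configs]
        losses.sort(key=lambda t: t[0])
        # keep the best 1/eta and give survivors eta times more resource
        configs = [c for _, c in losses[: max(1, len(configs) // eta)]]
        r *= eta
    return configs[0]
```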
A tutorial on Bayesian optimization, a method of finding the maximum of expensive cost functions using the Bayesian technique of setting a prior over the objective function and combining it with evidence to get a posterior function.
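The prior-plus-evidence step the tutorial describes can be sketched with a Gaussian-process surrogate; the scikit-learn regressor and toy objective below are illustrative assumptions, not the tutorial's own code.

```python
# Prior-to-posterior sketch for Bayesian optimization (toy objective).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def objective(x):                         # stands in for an expensive cost function
    return -np.sin(3 * x) - x**2 + 0.7 * x

X_obs = np.array([[-0.9], [1.1], [0.3]])  # configurations evaluated so far
y_obs = objective(X_obs).ravel()

# RBF prior over the objective, combined with the evidence (X_obs, y_obs)
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X_obs, y_obs)
X_grid = np.linspace(-2, 2, 200).reshape(-1, 1)
mu, sigma = gp.predict(X_grid, return_std=True)   # posterior mean and std
```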
New design criteria for next-generation hyperparameter optimization software are introduced, including a define-by-run API that allows users to construct the parameter search space dynamically, and an easy-to-set-up, versatile architecture that can be deployed for various purposes.
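Optuna's define-by-run style looks roughly like the sketch below, where the search space is built dynamically inside the objective, branch by branch; the toy losses stand in for a real validation metric.

```python
# Define-by-run sketch with Optuna: the space is declared as the trial runs.
import optuna

def objective(trial):
    classifier = trial.suggest_categorical("classifier", ["svm", "tree"])
    if classifier == "svm":
        c = trial.suggest_float("C", 1e-3, 1e3, log=True)  # sampled only on this branch
        return abs(c - 1.0)               # toy loss standing in for validation error
    depth = trial.suggest_int("max_depth", 2, 32)
    return abs(depth - 8) / 8.0           # toy loss

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```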
A neutral, multi-faceted large-scale empirical study on state-of-the-art models and evaluation measures finds that most models can reach similar scores with enough hyperparameter optimization and random restarts, suggesting that improvements can arise from a higher computational budget and more tuning rather than from fundamental algorithmic changes.
An algorithm for inexpensive gradient-based hyperparameter optimization that combines the implicit function theorem (IFT) with efficient inverse-Hessian approximations is proposed and used to train modern network architectures with millions of weights and millions of hyperparameters.
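The IFT hypergradient takes the form -(dL_val/dw) H^{-1} (d^2 L_train / dw dlambda), with the inverse-Hessian-vector product approximated cheaply, for example by a truncated Neumann series; the tiny quadratic below is an illustrative sketch, not the paper's implementation.

```python
# IFT hypergradient sketch with a truncated Neumann-series inverse-Hessian
# approximation, on a tiny quadratic problem (matrices are illustrative).
import numpy as np

H = np.array([[2.0, 0.3], [0.3, 1.5]])    # d^2 L_train / dw^2 at the optimum
J = np.array([[0.5], [0.2]])              # d^2 L_train / (dw dlambda)
v = np.array([1.0, -0.5])                 # dL_val / dw

def neumann_inverse_hvp(H, v, alpha=0.1, k=50):
    """Approximate H^{-1} v as alpha * sum_{i<k} (I - alpha*H)^i v."""
    p = alpha * v
    total = p.copy()
    for _ in range(k - 1):
        p = p - alpha * (H @ p)           # multiply by (I - alpha*H)
        total += p
    return total

hypergrad = -neumann_inverse_hvp(H, v) @ J   # -v^T H^{-1} J
print(hypergrad)                             # ~= -v @ np.linalg.solve(H, J)
```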
This tutorial describes how Bayesian optimization works, including Gaussian process regression and three common acquisition functions: expected improvement, entropy search, and knowledge gradient, and provides a generalization of expected improvement to noisy evaluations beyond the noise-free setting where it is more commonly applied.
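In the noise-free maximization setting, expected improvement has the closed form EI(x) = sigma(x) * (z * Phi(z) + phi(z)) with z = (mu(x) - f_best) / sigma(x); the short sketch below implements exactly that, assuming a Gaussian posterior at each candidate point.

```python
# Closed-form expected improvement for maximization (noise-free case).
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, f_best):
    """EI given posterior mean/std arrays at candidate points."""
    sigma = np.maximum(sigma, 1e-12)      # guard against zero variance
    z = (mu - f_best) / sigma
    return sigma * (z * norm.cdf(z) + norm.pdf(z))
```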
This paper evaluates the importance of different network design choices and hyperparameters for five common linguistic sequence tagging tasks and finds that some parameters, such as the pre-trained word embeddings or the last layer of the network, have a large impact on performance, while other parameters, such as the number of LSTM layers or the number of recurrent units, are of minor importance.
This work introduces a simple and robust hyperparameter optimization algorithm called ASHA, which exploits parallelism and aggressive early stopping to tackle large-scale hyperparameter optimization problems, and shows that ASHA outperforms existing state-of-the-art hyperparameter optimization methods.
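ASHA's core idea fits in a few lines: whenever a worker frees up, promote a top-1/eta configuration from the highest rung that has one waiting, otherwise start a fresh random configuration at the bottom rung. The sketch below assumes hypothetical helpers and a mutable record layout; since no worker ever waits for a rung to fill, stragglers cannot block promotions.

```python
# ASHA promotion-rule sketch. `rungs` is a list of rungs, each a list of
# mutable records [loss, config, promoted]; `sample_config` is hypothetical.
def asha_get_job(rungs, sample_config, eta=3):
    """Return (config, rung_index) for a free worker to run next."""
    for k in range(len(rungs) - 2, -1, -1):           # scan upper rungs first
        rung = sorted(rungs[k], key=lambda rec: rec[0])
        n_promotable = len(rung) // eta               # top 1/eta may advance
        for rec in rung[:n_promotable]:
            if not rec[2]:                            # not promoted yet
                rec[2] = True
                return rec[1], k + 1                  # rerun at the next rung
    return sample_config(), 0                         # else grow the bottom rung
```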
This work has undergone no intensive hyperparameter optimization and lived entirely on a commodity desktop machine that made the author's small studio apartment far too warm in the midst of a San Franciscan summer.
The Multi-Fidelity Ensemble Surrogate (MFES) is proposed, an efficient Hyperband method that is capable of utilizing both high-fidelity and low-fidelity measurements to accelerate the convergence of HPO tasks.
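One way to picture an ensemble surrogate over fidelities is a weighted combination of per-fidelity models, as sketched below; the interface and weighting are deliberate simplifications, and the paper's actual scheme for setting the weights differs in detail.

```python
# Simplified multi-fidelity ensemble surrogate: a weighted mean of
# per-fidelity model predictions (weights would be set by how well each
# fidelity agrees with the highest one; here they are given as inputs).
import numpy as np

def ensemble_predict(models, weights, X):
    """models: per-fidelity surrogates with .predict(X); returns blended mean."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                                   # normalize the weights
    preds = np.stack([m.predict(X) for m in models])  # (n_fidelities, n_points)
    return w @ preds
```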
Adding a benchmark result helps the community track progress.