3260 papers • 126 benchmarks • 313 datasets
Hierarchical reinforcement learning (HRL) decomposes a long-horizon reinforcement learning problem into a hierarchy of subtasks, with high-level policies selecting subgoals or subtasks and low-level policies executing the primitive actions that achieve them.
These leaderboards are used to track progress in Hierarchical Reinforcement Learning
Use these libraries to find Hierarchical Reinforcement Learning models and implementations
No subtasks available.
A tutorial on Bayesian optimization, a method for finding the maximum of expensive cost functions by placing a Bayesian prior over the objective function and combining it with observed evidence to obtain a posterior over the objective.
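As a concrete illustration of that loop, here is a minimal sketch assuming scikit-learn's Gaussian process regressor and an Expected Improvement acquisition maximized over random candidates; the objective `f`, its domain, and the hyperparameters are placeholders for the example, not anything from the tutorial itself.

```python
# Minimal Bayesian-optimization loop: maintain a GP posterior over the
# objective, then query wherever Expected Improvement (EI) is largest.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):                                      # placeholder "expensive" objective
    return -(x - 0.6) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (3, 1))                  # small initial design
y = f(X).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(20):
    gp.fit(X, y)                               # prior + evidence -> posterior
    cand = rng.uniform(0, 1, (1000, 1))        # random candidate queries
    mu, sigma = gp.predict(cand, return_std=True)
    best = y.max()
    z = (mu - best) / (sigma + 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = cand[np.argmax(ei)].reshape(1, 1) # most promising candidate
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next).ravel())

print("best x:", float(X[y.argmax()][0]), "best f(x):", float(y.max()))
```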
This paper studies how to develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.
Results on a number of difficult continuous-control tasks show that the proposed notion of sub-optimality of a representation, defined in terms of the expected reward of the optimal hierarchical policy using that representation, yields both qualitatively better representations and quantitatively better hierarchical policies than existing methods.
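Written out, one plausible formalization of that notion is the worst-case value lost by committing to the best hierarchical policy that acts through the representation; the notation below is assumed for illustration and may differ from the paper's exact statement.

```latex
% Sub-optimality of a representation \varphi: worst-case value gap between
% the optimal policy \pi^{*} and the optimal hierarchical policy
% \pi^{*}_{\varphi} restricted to act through \varphi (notation assumed).
\mathrm{SubOpt}(\varphi) \;=\; \sup_{s \in S}\, V^{\pi^{*}}(s) - V^{\pi^{*}_{\varphi}}(s)
```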
The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction introduced in the paper.
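To make the decomposition behind MAXQ-Q concrete, here is a minimal tabular sketch of its two update rules, assuming a task graph `children` mapping each composite task to its child subtasks; the names and data structures are illustrative, not Dietterich's reference implementation.

```python
# MAXQ value decomposition: Q(i, s, a) = V(a, s) + C(i, s, a), i.e. the
# value of executing child a plus the "completion" value of then finishing
# the parent task i. MAXQ-Q learns V at primitive leaves and C elsewhere.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.99
V = defaultdict(float)    # V[(a, s)]: value of primitive action a in state s
C = defaultdict(float)    # C[(i, s, a)]: completion value for parent task i

def q(i, s, a, children):
    if a in children:                          # composite child: recurse
        v = max(q(a, s, b, children) for b in children[a])
    else:                                      # primitive child
        v = V[(a, s)]
    return v + C[(i, s, a)]

def update_primitive(a, s, r):
    # primitive leaf: running average of the observed one-step reward
    V[(a, s)] += ALPHA * (r - V[(a, s)])

def update_completion(i, s, a, s_next, n_steps, children):
    # after child a ran for n_steps and ended in s_next, back up the best
    # continuation of parent i, discounted by the steps the child consumed
    best = max(q(i, s_next, b, children) for b in children[i])
    C[(i, s, a)] += ALPHA * (GAMMA ** n_steps * best - C[(i, s, a)])
```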
Inspired by the offline consultation process, this work integrates a two-level hierarchical policy structure into the dialogue system for policy learning, achieving higher accuracy and symptom recall in disease diagnosis than existing systems.
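The two-level structure can be sketched as a master policy that activates one worker (say, a disease-group specialist) and a worker that picks the concrete dialogue action; the class and action names below are placeholders, and random choice stands in for the learned policies.

```python
import random

class Worker:
    """Low level: picks a concrete dialogue action (placeholder policy)."""
    def __init__(self, actions):
        self.actions = actions                 # symptom inquiries / diagnoses
    def act(self, state):
        return random.choice(self.actions)     # stand-in for a learned policy

class Master:
    """High level: activates one worker, e.g. a disease-group specialist."""
    def __init__(self, workers):
        self.workers = workers
    def step(self, state):
        worker = random.choice(self.workers)   # stand-in for a learned policy
        return worker.act(state)

master = Master([Worker(["ask_fever", "ask_cough", "diagnose_flu"]),
                 Worker(["ask_rash", "ask_itch", "diagnose_eczema"])])
print(master.step(state={}))
```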
This work proposes a general framework that first learns useful skills in a pre-training environment and then leverages them to learn faster in downstream tasks, using Stochastic Neural Networks combined with an information-theoretic regularizer to efficiently pre-train a large span of skills.
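A rough sketch of the information-theoretic idea, assuming a bonus of the form log q(z | s) estimated from visitation counts over discretized states; the discretization and all names are assumptions, not the paper's exact estimator.

```python
# Skill-diversity bonus: reward states from which the active latent skill
# code z is identifiable, by adding an empirical estimate of log q(z | s).
import math
from collections import defaultdict

counts = defaultdict(lambda: defaultdict(int))     # counts[state_cell][z]

def mi_bonus(state_cell, z):
    counts[state_cell][z] += 1
    total = sum(counts[state_cell].values())
    q_z_given_s = counts[state_cell][z] / total    # empirical q(z | s)
    return math.log(q_z_given_s + 1e-8)

# Used during pre-training as: total_reward = env_reward + mi_bonus(cell, z)
```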
This paper presents a novel paradigm for relation extraction that regards the related entities as the arguments of a relation, and applies a hierarchical reinforcement learning (HRL) framework within this paradigm to enhance the interaction between entity mentions and relation types.
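The control flow can be sketched as below, with a high-level policy scanning tokens and launching a low-level argument-extraction subtask when it detects a relation; both policies here are toy stubs and every name is a placeholder.

```python
# Illustrative two-level extraction loop: the high-level policy detects a
# relation, then the low-level policy fills that relation's argument slots.
def extract(tokens, high_policy, low_policy):
    triples = []
    for i in range(len(tokens)):
        rel = high_policy(tokens, i)            # None, or a detected relation
        if rel is not None:
            args = low_policy(tokens, i, rel)   # entity mentions filling slots
            triples.append((rel, args))
    return triples

# Toy stubs: trigger on the word "founded" and grab its neighbors.
high = lambda toks, i: "founded_by" if toks[i] == "founded" else None
low = lambda toks, i, rel: (toks[i + 1], toks[i - 1])   # (company, founder)
print(extract("Musk founded SpaceX".split(), high, low))
```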
This work proposes an unsupervised learning scheme based on asymmetric self-play (Sukhbaatar et al., 2018) that automatically learns both a good representation of sub-goals in the environment and a low-level policy that can execute them.
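A minimal sketch of the self-play reward structure described in Sukhbaatar et al. (2018): Alice sets a task in t_A steps, Bob attempts it in t_B steps, Bob is paid for speed and Alice for tasks just beyond Bob's current ability; the scale constant here is an assumed value.

```python
GAMMA = 0.01   # reward scale (assumed value)

def self_play_rewards(t_alice, t_bob):
    r_bob = -GAMMA * t_bob                      # Bob: solve the task quickly
    r_alice = GAMMA * max(0, t_bob - t_alice)   # Alice: hardest feasible task
    return r_alice, r_bob

print(self_play_rewards(t_alice=12, t_bob=30))  # ~ (0.18, -0.3)
```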