Mean in-game score over 1,000 episodes played with random seeds not seen during training. See https://arxiv.org/abs/2006.13760 (Section 2.4, Evaluation Protocol) for details.
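The protocol above can be sketched as a simple evaluation loop. This is a minimal illustration, not the reference implementation: `env_fn` and `policy` are hypothetical stand-ins for an environment constructor (seeded with held-out seeds) and a trained agent, and the environment is assumed to expose a gym-style `reset`/`step` interface whose per-step rewards sum to the in-game score.

```python
def evaluate(env_fn, policy, n_episodes=1000, seed0=0):
    """Mean in-game score over n_episodes, each run on a distinct
    held-out seed (seed0, seed0 + 1, ...) not used during training."""
    scores = []
    for i in range(n_episodes):
        env = env_fn(seed=seed0 + i)   # fresh env per held-out seed
        obs = env.reset()
        done, score = False, 0.0
        while not done:
            # step with the trained policy; accumulate reward as score
            obs, reward, done = env.step(policy(obs))
            score += reward
        scores.append(score)
    return sum(scores) / len(scores)
```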
These leaderboards are used to track progress in NetHack.
It is argued that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL, while dramatically reducing the computational resources required to gather a large amount of experience.
The regulated difference of inverse visitation counts is proposed as a simple but effective criterion for intrinsic reward (IR) that helps the agent explore beyond the boundary of already-explored regions and mitigates common issues in count-based methods, such as short-sightedness and detachment.
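The criterion above can be sketched as a clipped difference of inverse-square-root visitation counts. This is an illustrative count-based sketch only; the function name `noveld_bonus`, the `alpha` scaling factor, and the +1 smoothing for unvisited states are assumptions of this example rather than details from the source.

```python
from collections import defaultdict

def noveld_bonus(counts, s, s_next, alpha=0.5):
    """Regulated (clipped) difference of inverse visitation counts.

    The bonus is positive only when the agent moves from a frequently
    visited state to a rarely visited one, i.e. steps beyond the boundary
    of the explored region. Because of the clipping, dithering back and
    forth across the same pair of states earns no net bonus.
    """
    # Inverse-sqrt count novelty; +1 keeps unvisited states finite (assumption).
    novelty = lambda x: (counts[x] + 1) ** -0.5
    return max(novelty(s_next) - alpha * novelty(s), 0.0)
```

The asymmetric clipping is the "regulated" part: it rewards crossing the frontier outward but not retreating back, which is what discourages the short-sighted revisiting behavior mentioned above.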
This work presents CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package, with the hope that community contributions will accelerate the development of new continual RL algorithms.
The experiments show that learning with prior knowledge of useful skills can significantly improve the performance of agents on complex problems, and it is argued that utilising predefined skills provides a useful inductive bias for RL problems, especially those with large state-action spaces and sparse rewards.
MiniHack is a one-stop shop for RL experiments with environments ranging from small rooms to complex, procedurally generated worlds, and can wrap existing RL benchmarks and provide ways to seamlessly add additional complexity.
The multi-environment Symbolic Interactive Language Grounding benchmark (SILG) is proposed, which unifies a collection of diverse grounded language learning environments under a common interface and enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
This paper proposes a simple but effective criterion called NovelD, which solves all the static procedurally-generated tasks in MiniGrid with just 120M environment steps, without any curriculum learning; empirically, it helps the agent explore the environment more uniformly, with a focus on exploring beyond the boundary of visited regions.
The challenge served as a direct comparison between neural and symbolic AI, as well as hybrid systems, demonstrating that on NetHack symbolic bots currently outperform deep RL by a large margin.
This work proposes Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language descriptions, and then fine-tunes these language-aware pretrained representations via reinforcement learning (RL).
A wide range of existing algorithms are evaluated including online and offline RL, as well as learning from demonstrations, showing that significant research advances are needed to fully leverage large-scale datasets for challenging sequential decision making tasks.