Mean in-game score over 1,000 episodes played with random seeds not seen during training. See https://arxiv.org/abs/2006.13760 (Section 2.4, Evaluation Protocol) for details.
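The protocol above can be sketched as a simple evaluation loop. This is a minimal illustration, not the reference implementation: `env_fn` and `policy` are hypothetical stand-ins for an environment constructor (seeded with held-out seeds) and a trained agent, and the environment is assumed to expose a gym-style `reset`/`step` interface whose per-step rewards sum to the in-game score.

```python
def evaluate(env_fn, policy, n_episodes=1000, seed0=0):
    """Mean in-game score over n_episodes, each run on a distinct
    held-out seed (seed0, seed0 + 1, ...) not used during training."""
    scores = []
    for i in range(n_episodes):
        env = env_fn(seed=seed0 + i)   # fresh env per held-out seed
        obs = env.reset()
        done, score = False, 0.0
        while not done:
            # step with the trained policy; accumulate reward as score
            obs, reward, done = env.step(policy(obs))
            score += reward
        scores.append(score)
    return sum(scores) / len(scores)
```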
These leaderboards are used to track progress in NetHack.
It is argued that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL, while dramatically reducing the computational resources required to gather a large amount of experience.
The regulated difference of inverse visitation counts is proposed as a simple but effective criterion for intrinsic reward (IR) that helps the agent explore beyond the boundary of already-explored regions and mitigates common issues in count-based methods, such as short-sightedness and detachment.
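The criterion above can be sketched as a clipped difference of inverse-square-root visitation counts. This is an illustrative count-based sketch only; the function name `noveld_bonus`, the `alpha` scaling factor, and the +1 smoothing for unvisited states are assumptions of this example rather than details from the source.

```python
from collections import defaultdict

def noveld_bonus(counts, s, s_next, alpha=0.5):
    """Regulated (clipped) difference of inverse visitation counts.

    The bonus is positive only when the agent moves from a frequently
    visited state to a rarely visited one, i.e. steps beyond the boundary
    of the explored region. Because of the clipping, dithering back and
    forth across the same pair of states earns no net bonus.
    """
    # Inverse-sqrt count novelty; +1 keeps unvisited states finite (assumption).
    novelty = lambda x: (counts[x] + 1) ** -0.5
    return max(novelty(s_next) - alpha * novelty(s), 0.0)
```

The asymmetric clipping is the "regulated" part: it rewards crossing the frontier outward but not retreating back, which is what discourages the short-sighted revisiting behavior mentioned above.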
This work presents CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package, with the hope that community contributions will accelerate the development of new continual RL algorithms.
The experiments show that learning with prior knowledge of useful skills can significantly improve the performance of agents on complex problems, and it is argued that utilising predefined skills provides a useful inductive bias for RL problems, especially those with large state-action spaces and sparse rewards.
MiniHack is a one-stop shop for RL experiments with environments ranging from small rooms to complex, procedurally generated worlds, and can wrap existing RL benchmarks and provide ways to seamlessly add additional complexity.
The multi-environment Symbolic Interactive Language Grounding benchmark (SILG) is proposed, which unifies a collection of diverse grounded language learning environments under a common interface and enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
This paper proposes a simple but effective criterion called NovelD, which solves all the static procedurally-generated tasks in MiniGrid with just 120M environment steps, without any curriculum learning; empirically, it helps the agent explore the environment more uniformly, with a focus on exploring beyond the boundary of visited regions.
The challenge served as a direct comparison between neural and symbolic AI, as well as hybrid systems, demonstrating that on NetHack symbolic bots currently outperform deep RL by a large margin.
This work proposes Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language descriptions, and then fine-tunes these language-aware pretrained representations via reinforcement learning (RL).
A wide range of existing algorithms are evaluated including online and offline RL, as well as learning from demonstrations, showing that significant research advances are needed to fully leverage large-scale datasets for challenging sequential decision making tasks.