Offline RL

3260 papers • 126 benchmarks • 313 datasets

Benchmarks

These leaderboards are used to track progress in offline RL.

Dataset

D4RL

Walker2d

Libraries

Use these libraries to find offline RL models and implementations.

zzmtsvv/rl_task

14 papers 35

Datasets

RLU

Visuomotor affordance learning (VAL) robot interaction dataset

Subtasks

DQN Replay Dataset

Most implemented papers

Conservative Q-Learning for Offline Reinforcement Learning

Aurick Zhou, S. Levine, Aviral Kumar, G. Tucker•Sun Jun 07 2020

This work proposes conservative Q-learning (CQL), which aims to address limitations of offline RL methods by learning a conservative Q-function such that the expected value of a policy under this Q-function lower-bounds its true value.

2263
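The conservative idea in the summary above can be sketched for a single state with discrete actions. This is a minimal illustration, not the authors' implementation: it assumes the commonly used variant in which a log-sum-exp over all actions is pushed down while the Q-value of the dataset action is pushed up. The function names, the `alpha` weight, and the squared TD term are assumptions made for the example.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def cql_penalty(q_values, dataset_action):
    # Soft maximum of Q over all actions minus Q of the logged action.
    # It is never negative, and it grows when an action that is not in
    # the data receives a high Q-value.
    return logsumexp(q_values) - q_values[dataset_action]

def cql_loss(q_values, dataset_action, td_target, alpha=1.0):
    # Hypothetical combined objective: squared TD error plus the
    # conservative penalty weighted by alpha (an assumed hyperparameter).
    td_error = q_values[dataset_action] - td_target
    return 0.5 * td_error ** 2 + alpha * cql_penalty(q_values, dataset_action)
```

Because the penalty is minimized only when the logged action dominates the soft maximum, minimizing it discourages overestimating actions the dataset never took.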

yihaosun1124/OfflineRL-Kit

8 papers 228

7 papers 387

4 papers 2,539

4 papers 1,197

4 papers 35

idiap/fast-transformers

2 papers 1,570

google-research/batch_rl

2 papers 506

deepmind/rgb_stacking

2 papers 114

haosulab/ManiSkill-Learn

2 papers 59

0

Decision Transformer: Reinforcement Learning via Sequence Modeling

Aditya Grover, P. Abbeel, A. Srinivas, A. Rajeswaran, Kimin Lee, M. Laskin, Kevin Lu, Igor Mordatch, Lili Chen•Tue Jun 01 2021

Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.

2050 0

Reformer: The Efficient Transformer

Lukasz Kaiser, Anselm Levskaya, Nikita Kitaev•Sun Jan 12 2020

This work replaces dot-product attention by one that uses locality-sensitive hashing and uses reversible residual layers instead of the standard residuals, which allows storing activations only once in the training process instead of several times, making the model much more memory-efficient and much faster on long sequences.

2758 0
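The memory saving from reversible residual layers comes from being able to recompute a layer's inputs from its outputs. A minimal scalar sketch follows, assuming the standard two-stream reversible formulation; `f` and `g` are hypothetical stand-ins for the attention and feed-forward sublayers, not the paper's actual functions.

```python
def f(x):
    # Hypothetical stand-in for the attention sublayer.
    return 0.5 * x + 1.0

def g(x):
    # Hypothetical stand-in for the feed-forward sublayer.
    return x * x

def rev_forward(x1, x2):
    # Two-stream reversible block: each stream is updated using the other.
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def rev_inverse(y1, y2):
    # Inputs are recovered exactly from outputs, so activations do not
    # need to be stored for the backward pass.
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2
```

Calling `rev_inverse(*rev_forward(x1, x2))` returns the original pair up to floating-point rounding, whatever `f` and `g` compute.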

Offline Reinforcement Learning with Implicit Q-Learning

S. Levine, Ilya Kostrikov, Ashvin Nair•Mon Oct 11 2021

This work proposes an offline RL method that never needs to evaluate actions outside of the dataset, but still enables the learned policy to improve substantially over the best behavior in the data through generalization.

1247 0
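One way to avoid evaluating out-of-dataset actions is to fit a state-value function with an asymmetric (expectile) regression on Q-values of logged actions only, which is the approach the paper is known for. The sketch below shows that loss on plain floats; the function names and the default `tau=0.7` are illustrative assumptions.

```python
def expectile_loss(diff, tau=0.7):
    # Asymmetric squared loss: positive differences are weighted by tau,
    # negative ones by (1 - tau). With tau > 0.5 the fitted value leans
    # toward the upper range of the observed Q-values.
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff ** 2

def value_loss(q_values, v_values, tau=0.7):
    # Mean expectile loss between Q(s, a) for logged (s, a) pairs and V(s).
    # Only dataset actions appear, so no unseen action is ever queried.
    losses = [expectile_loss(q - v, tau) for q, v in zip(q_values, v_values)]
    return sum(losses) / len(losses)
```

With `tau=0.5` the loss reduces to an ordinary scaled squared error; moving `tau` toward 1 makes the value estimate approach the best in-data action.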

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

S. Levine, Justin Fu, Aviral Kumar, Ofir Nachum, G. Tucker•Tue Apr 14 2020

This work introduces benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL, and releases benchmark tasks and datasets with a comprehensive evaluation of existing algorithms and an evaluation protocol together with an open-source codebase.

1581 0

A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto, S. Gu•Fri Jun 11 2021

This paper finds that it can match the performance of state-of-the-art offline RL algorithms by simply adding a behavior cloning term to the policy update of an online RL algorithm and normalizing the data.

1025 0
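The two ingredients named in the summary, a behavior cloning term in the policy update and data normalization, can be sketched as follows. This is a simplified illustration with scalar actions and a single state feature; the function names are hypothetical, and the `alpha=2.5` and `eps=1e-3` defaults are assumed here as commonly cited settings rather than taken from this page.

```python
import math

def td3_bc_actor_loss(q_values, policy_actions, dataset_actions, alpha=2.5):
    # q_values[i] is Q(s_i, pi(s_i)); actions are scalars for simplicity.
    n = len(q_values)
    # Scale the Q term by the mean absolute Q so both terms are comparable.
    # Assumes the mean absolute Q-value is nonzero.
    lam = alpha / (sum(abs(q) for q in q_values) / n)
    q_term = sum(q_values) / n
    bc_term = sum((p - a) ** 2
                  for p, a in zip(policy_actions, dataset_actions)) / n
    # Maximize Q while staying close to the logged actions.
    return -lam * q_term + bc_term

def normalize_states(states, eps=1e-3):
    # Standardize one state feature using dataset statistics.
    mean = sum(states) / len(states)
    std = math.sqrt(sum((s - mean) ** 2 for s in states) / len(states))
    return [(s - mean) / (std + eps) for s in states]
```

The behavior cloning term is zero when the policy reproduces the logged actions, so any deviation has to be paid for with a higher Q-value.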

Rethinking Attention with Performers

Lucy J. Colwell, Lukasz Kaiser, Jared Davis, Andreea Gane, Afroz Mohiuddin, David Dohan, Adrian Weller, K. Choromanski, David Belanger, Valerii Likhosherstov, Xingyou Song, Tamás Sarlós, Peter Hawkins•Tue Sep 29 2020

This work introduces Performers, Transformer architectures that can estimate regular (softmax) full-rank-attention Transformers with provable accuracy using only linear space and time complexity, without relying on priors such as sparsity or low-rankness.

2032 0
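The softmax kernel exp(q · k) can be estimated with positive random features, which is the kind of approximation that makes linear-complexity attention possible. The sketch below uses independent Gaussian projections, a simplifying assumption (orthogonal projections are also commonly used); the function names and the sample sizes are chosen only for illustration.

```python
import math
import random

def positive_features(x, projections):
    # phi(x)_i = exp(w_i . x - |x|^2 / 2) / sqrt(m); every entry is positive.
    half_sq_norm = sum(v * v for v in x) / 2.0
    scale = 1.0 / math.sqrt(len(projections))
    return [scale * math.exp(sum(w * v for w, v in zip(row, x)) - half_sq_norm)
            for row in projections]

def approx_softmax_kernel(q, k, projections):
    # The dot product of feature vectors is an unbiased estimate of exp(q . k).
    fq = positive_features(q, projections)
    fk = positive_features(k, projections)
    return sum(a * b for a, b in zip(fq, fk))

rng = random.Random(0)
# m = 5000 random directions in 2 dimensions (illustrative sizes).
projections = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(5000)]
q, k = [0.3, -0.2], [0.1, 0.4]
exact = math.exp(sum(a * b for a, b in zip(q, k)))
estimate = approx_softmax_kernel(q, k, projections)
```

Because the estimate factors into separate features for queries and keys, the key features can be summed once and reused for every query instead of forming all pairwise scores.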

MOPO: Model-based Offline Policy Optimization

Stefano Ermon, Chelsea Finn, Tengyu Ma, S. Levine, Tianhe Yu, James Y. Zou, Lantao Yu, G. Thomas•Tue May 26 2020

A new model-based offline RL algorithm is proposed that applies the variance of a Lipschitz-regularized model as a penalty to the reward function, and it is found that this algorithm outperforms both standard model-based RL methods and existing state-of-the-art model-free offline RL approaches on existing offline RL benchmarks, as well as two challenging continuous control tasks.

873 0
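The reward penalty described above can be sketched as subtracting a scaled uncertainty estimate from the model's predicted reward. The sketch assumes an ensemble of dynamics models that each predict a per-dimension standard deviation, and takes the largest norm across the ensemble as the uncertainty; the function names and the `lam` coefficient are hypothetical.

```python
import math

def uncertainty(ensemble_stds):
    # ensemble_stds[i] is the predicted std vector of ensemble member i.
    # The estimate is the largest L2 norm across members (an assumed heuristic).
    return max(math.sqrt(sum(s * s for s in stds)) for stds in ensemble_stds)

def penalized_reward(reward, ensemble_stds, lam=1.0):
    # Transitions the model is unsure about receive a lower reward,
    # steering the policy toward regions covered by the data.
    return reward - lam * uncertainty(ensemble_stds)
```

For example, `penalized_reward(10.0, [[3.0, 4.0], [0.0, 1.0]], lam=0.5)` subtracts half of the largest norm, 5.0, from the reward.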

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

Angelos Katharopoulos, François Fleuret, Apoorv Vyas, Nikolaos Pappas•Sun Jun 28 2020

This work expresses self-attention as a linear dot-product of kernel feature maps and uses the associativity of matrix products to reduce the complexity from O(N^2) to O(N), where N is the sequence length.

2404 0
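The associativity trick can be checked on small lists. The sketch below compares a reference that forms every query-key similarity with a version that first sums the key-value products once; it assumes the feature map elu(x) + 1 and non-causal attention, and the function names are illustrative.

```python
import math

def phi(x):
    # Feature map elu(x) + 1: equals x + 1 for x > 0 and exp(x) otherwise.
    return x + 1.0 if x > 0 else math.exp(x)

def quadratic_attention(Q, K, V):
    # Reference version: computes all N x N similarities phi(q_i) . phi(k_j).
    out = []
    for q in Q:
        sims = [sum(phi(a) * phi(b) for a, b in zip(q, k)) for k in K]
        total = sum(sims)
        out.append([sum(s * v[c] for s, v in zip(sims, V)) / total
                    for c in range(len(V[0]))])
    return out

def linear_attention(Q, K, V):
    d, dv = len(K[0]), len(V[0])
    fk = [[phi(x) for x in k] for k in K]
    # Summaries computed once: kv[a][c] = sum_j phi(k_j)[a] * v_j[c].
    kv = [[sum(fk[j][a] * V[j][c] for j in range(len(K))) for c in range(dv)]
          for a in range(d)]
    # Normalizer summary: z[a] = sum_j phi(k_j)[a].
    z = [sum(row[a] for row in fk) for a in range(d)]
    out = []
    for q in Q:
        fq = [phi(x) for x in q]
        denom = sum(fq[a] * z[a] for a in range(d))
        out.append([sum(fq[a] * kv[a][c] for a in range(d)) / denom
                    for c in range(dv)])
    return out
```

Both functions return the same values up to rounding, but the second touches each key and value once, so its cost grows linearly with the sequence length.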

Acme: A Research Framework for Distributed Reinforcement Learning

Serkan Cabi, Caglar Gulcehre, T. Paine, Nando de Freitas, Ziyun Wang, Fan Yang, Bilal Piot, Gabriel Barth-Maron, A. Abdolmaleki, Sergio Gomez Colmenarejo, Alexander Novikov, Matthew W. Hoffman, John Aslanides, Sarah Henderson, Albin Cassirer, A. Cowie, Bobak Shahriari, Feryal M. P. Behbahani, Tamara Norman, Kate Baumli•Sun May 31 2020

It is shown that the design decisions behind Acme lead to agents that can be scaled both up and down and that, for the most part, greater levels of parallelization result in agents with equivalent performance, just faster.

239 0
