Conservative Q-learning (CQL) is proposed to address limitations of offline RL methods by learning a conservative Q-function such that the expected value of a policy under this Q-function lower-bounds its true value.
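A minimal discrete-action sketch of this idea (not the paper's reference implementation), assuming PyTorch and hypothetical q_net / target_q_net modules that map a batch of observations to per-action Q-values: the usual TD error is augmented with a penalty that pushes Q-values down on all actions while keeping the Q-value of the dataset action high.

```python
import torch
import torch.nn.functional as F

def cql_loss(q_net, target_q_net, batch, alpha=1.0, gamma=0.99):
    """Conservative Q-learning sketch for discrete actions.

    Loss = TD error + alpha * (logsumexp_a Q(s, a) - Q(s, a_data)),
    so the learned Q-function is pushed down on out-of-distribution actions
    and stays a lower bound on the value of the dataset policy.
    """
    obs, actions, rewards, next_obs, dones = batch  # tensors from an offline dataset

    q_all = q_net(obs)                                        # (B, num_actions)
    q_data = q_all.gather(1, actions.unsqueeze(1)).squeeze(1)  # Q(s, a_data)

    with torch.no_grad():  # standard Bellman target from a frozen target network
        target = rewards + gamma * (1 - dones) * target_q_net(next_obs).max(dim=1).values

    td_loss = F.mse_loss(q_data, target)
    conservative_penalty = (torch.logsumexp(q_all, dim=1) - q_data).mean()
    return td_loss + alpha * conservative_penalty
```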
It is shown that the design decisions behind Acme yield agents that can be scaled both up and down, and that, for the most part, greater levels of parallelization produce agents with equivalent performance, only trained faster.
This paper proposes RL Unplugged, a benchmark for evaluating and comparing offline RL methods, along with detailed evaluation protocols for each domain and an extensive analysis of supervised learning and offline RL methods under these protocols.
This work presents a systematic and extensive analysis of experience replay in Q-learning methods, focusing on two fundamental properties: the replay capacity and the ratio of learning updates to experience collected (replay ratio).
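A rough sketch of how these two knobs typically appear in a Q-learning training loop, assuming a Gym-style env and a hypothetical agent with act/update methods; the constants and names are illustrative, not taken from the paper.

```python
import random
from collections import deque

# The two properties studied: replay capacity (buffer size) and replay ratio
# (gradient updates per environment transition collected).
REPLAY_CAPACITY = 1_000_000   # transitions beyond this are evicted FIFO
REPLAY_RATIO = 0.25           # e.g. one update for every four transitions collected

buffer = deque(maxlen=REPLAY_CAPACITY)

def collect_and_train(env, agent, total_steps, batch_size=32):
    obs = env.reset()
    updates_owed = 0.0
    for _ in range(total_steps):
        action = agent.act(obs)
        next_obs, reward, done, _ = env.step(action)
        buffer.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs

        # Accumulate fractional updates so any replay ratio is supported.
        updates_owed += REPLAY_RATIO
        while updates_owed >= 1.0 and len(buffer) >= batch_size:
            agent.update(random.sample(buffer, batch_size))
            updates_owed -= 1.0
```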
It is demonstrated that recent off-policy deep RL algorithms, even when trained solely on a logged DQN replay dataset, outperform the fully trained DQN agent; Random Ensemble Mixture (REM), a robust Q-learning algorithm that enforces optimal Bellman consistency on random convex combinations of multiple Q-value estimates, is also presented.
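A minimal sketch of the REM idea, assuming PyTorch and a hypothetical list of Q-heads with frozen target counterparts (the paper uses a single multi-head network): each update draws random convex weights and applies the Bellman backup to the resulting mixture of Q-estimates.

```python
import torch
import torch.nn.functional as F

def rem_loss(q_heads, target_q_heads, batch, gamma=0.99):
    """Random Ensemble Mixture (REM) sketch for discrete actions.

    Each update samples a random convex combination of the K Q-heads and
    enforces the Bellman equation on that mixture, so convex combinations of
    the heads stay approximately Bellman-consistent.
    """
    obs, actions, rewards, next_obs, dones = batch
    num_heads = len(q_heads)

    # Random convex combination weights (non-negative, summing to 1).
    alphas = torch.rand(num_heads)
    alphas = alphas / alphas.sum()

    q_mix = sum(a * head(obs) for a, head in zip(alphas, q_heads))   # (B, num_actions)
    q_pred = q_mix.gather(1, actions.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        target_mix = sum(a * head(next_obs) for a, head in zip(alphas, target_q_heads))
        target = rewards + gamma * (1 - dones) * target_mix.max(dim=1).values

    return F.smooth_l1_loss(q_pred, target)
```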