Acme: A Research Framework for Distributed Reinforcement Learning (2020-06-01)

TL;DR

The design decisions behind Acme lead to agents that can be scaled both up and down; for the most part, greater levels of parallelization yield agents with equivalent performance, just faster.

Abstract

Deep reinforcement learning has led to many recent and groundbreaking advances. However, these advances have often come at the cost of increased scale and complexity in the underlying RL algorithms. This added complexity has in turn made it more difficult for researchers to reproduce published RL algorithms or to rapidly prototype ideas. To address this, we introduce Acme, a tool that simplifies the development of novel RL algorithms and is specifically designed to enable simple agent implementations that can be run at various scales of execution. Our aim is also to make the results of RL algorithms developed in academia and industrial labs easier to reproduce and extend. To this end, we are releasing baseline implementations of various algorithms, created using our framework. In this work we introduce the major design decisions behind Acme and show how they are used to construct these baselines. We also experiment with these agents at different scales of both complexity and computation, including distributed versions. Ultimately, we show that the design decisions behind Acme lead to agents that can be scaled both up and down and that, for the most part, greater levels of parallelization result in agents with equivalent performance, just faster.
