Solving Rubik's Cube with a Robot Hand (2019-10-16T00:00:00.000000Z)

TL;DR

Models trained only in simulation can solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by a novel algorithm, automatic domain randomization (ADR), and a robot platform built for machine learning.

Abstract

We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by two key components: a novel algorithm, which we call automatic domain randomization (ADR), and a robot platform built for machine learning. ADR automatically generates a distribution over randomized environments of ever-increasing difficulty. Control policies and vision state estimators trained with ADR exhibit vastly improved sim2real transfer. For control policies, memory-augmented models trained on an ADR-generated distribution of environments show clear signs of emergent meta-learning at test time. The combination of ADR with our custom robot platform allows us to solve a Rubik's cube with a humanoid robot hand, which involves both control and state estimation problems. Videos summarizing our results are available: this https URL
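The abstract describes ADR only at a high level, as a procedure that produces a distribution over randomized environments of ever-increasing difficulty. The sketch below is one way to picture that idea, not the authors' implementation. It widens the sampling range of a single environment parameter when measured performance is high and narrows it when performance is low. Every name and number here is an assumption made for illustration: `adr_update`, `toy_score`, `run_adr`, the thresholds, the step size, the single "friction" parameter, and the `skill` variable that stands in for training progress.

```python
import random


def adr_update(low, high, score, step=0.1, expand_at=0.8, shrink_at=0.4, default=1.0):
    """Return a new (low, high) sampling range given the score on the current range.

    All thresholds and the step size are illustrative assumptions.
    """
    if score >= expand_at:
        # Performance is good: make the environment distribution wider (harder).
        return low - step, high + step
    if score <= shrink_at:
        # Performance is poor: narrow the range, but never past the default value.
        return min(low + step, default), max(high - step, default)
    return low, high


def toy_score(friction, skill):
    """Hypothetical stand-in for a policy's success on one sampled environment."""
    return 1.0 if abs(friction - 1.0) <= skill else 0.0


def run_adr(iterations=50, episodes=20, seed=0):
    """Alternate between evaluating on sampled environments and adjusting the range."""
    rng = random.Random(seed)
    low = high = 1.0  # start from a single, non-randomized environment
    skill = 0.0       # hypothetical proxy for how much the policy has learned
    history = []
    for _ in range(iterations):
        samples = [rng.uniform(low, high) for _ in range(episodes)]
        score = sum(toy_score(f, skill) for f in samples) / episodes
        low, high = adr_update(low, high, score)
        skill += 0.05  # stand-in for one round of policy training
        history.append((low, high))
    return history


if __name__ == "__main__":
    for low, high in run_adr()[::10]:
        print(f"range: [{low:.2f}, {high:.2f}]")
```

In this toy setup the range starts as a single point and grows only as fast as the stand-in policy can keep up, which is the curriculum-like behavior the abstract attributes to ADR.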

Authors

Jonas Schneider

Wojciech Zaremba

A. Paino
