Synthesizing Ne… (2018-03-06T00:00:00.000000Z)

TL;DR

An algorithm for rapidly learning neural network policies for robotics systems that follows the model-based reinforcement learning paradigm and improves upon existing algorithms: PILCO and a sample-based version of PILCO with neural network dynamics (Deep-PILCO).

Abstract

We present an algorithm for rapidly learning neural network policies for robotics systems. The algorithm follows the model-based reinforcement learning paradigm and improves upon existing algorithms: PILCO and a sample-based version of PILCO with neural network dynamics (Deep-PILCO). To improve convergence, we propose a model-based algorithm that uses fixed random numbers and clips gradients during optimization. We propose training a neural network dynamics model using variational dropout with truncated Log-Normal noise. These improvements enable data-efficient synthesis of complex neural network policies. We test our approach on a variety of benchmark tasks, demonstrating data-efficiency that is competitive with that of PILCO, while being able to optimize complex neural network controllers. Finally, we assess the performance of the algorithm for learning motor controllers for a six-legged autonomous underwater vehicle. This demonstrates the potential of the algorithm for scaling up the dimensionality and dataset sizes in more complex tasks.
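The two convergence aids named in the abstract, fixed random numbers and gradient clipping, can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D linear system, the scalar feedback gain, the finite-difference gradient (standing in for backpropagation through a learned dynamics model), and every function name and hyperparameter below are assumptions chosen only to keep the example self-contained.

```python
import math
import random


def make_fixed_noise(n_particles, horizon, seed=0):
    """Draw the rollout noise once so every policy evaluation reuses it."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(horizon)]
            for _ in range(n_particles)]


def rollout_cost(gain, noise, noise_scale=0.1, x0=1.0):
    """Average quadratic cost of the policy u = -gain * x under the toy
    (assumed) dynamics x' = x + 0.5 * u + noise_scale * eps."""
    total = 0.0
    for eps_seq in noise:
        x = x0
        for eps in eps_seq:
            u = -gain * x
            x = x + 0.5 * u + noise_scale * eps
            total += x * x
    return total / len(noise)


def clip_by_norm(grad, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    return [g * max_norm / norm for g in grad]


def optimize_gain(noise, gain=0.0, lr=0.05, max_norm=1.0, steps=200, h=1e-4):
    """Clipped gradient descent on the gain; the gradient is a central
    finite difference evaluated on the same fixed noise at every step."""
    for _ in range(steps):
        grad = (rollout_cost(gain + h, noise)
                - rollout_cost(gain - h, noise)) / (2 * h)
        (step,) = clip_by_norm([grad], max_norm)
        gain -= lr * step
    return gain


noise = make_fixed_noise(n_particles=20, horizon=25)
initial_cost = rollout_cost(0.0, noise)
learned_gain = optimize_gain(noise)
final_cost = rollout_cost(learned_gain, noise)
print(initial_cost, learned_gain, final_cost)
```

Because the noise is drawn once, the sampled cost becomes a deterministic function of the policy parameter, so successive gradient estimates differ only through the parameter and not through resampling. Clipping bounds the very large early gradients that long rollouts through near-unstable dynamics tend to produce.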

Authors

D. Meger

J. A. G. Higuera

G. Dudek

References (32 items)

MBMF: Model-Based Priors for Model-Free Reinforcement Learning

Concrete Dropout

GP-ILQG: Data-driven Robust Optimal Control for Uncertain Nonlinear Dynamical Systems

Structured Bayesian Pruning via Log-Normal Multiplicative Noise

Black-box data-efficient policy search for robotics

Variational Dropout Sparsifies Deep Neural Networks

Continuous Deep Q-Learning with Model-based Acceleration

Learning Continuous Control Policies by Stochastic Value Gradients

Continuous control with deep reinforcement learning

Weight Uncertainty in Neural Networks

Variational Dropout and the Local Reparameterization Trick

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

Learning legged swimming gaits from experience

Dynamics and trajectory optimization for a soft spatial fluidic elastomer manipulator

Gaussian Processes for Data-Efficient Learning in Robotics and Control

Adam: A Method for Stochastic Optimization

Probabilistic Differential Dynamic Programming

Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics

Control-limited differential dynamic programming

On the difficulty of training recurrent neural networks

Model learning for robot control: a survey

Efficient reinforcement learning using Gaussian processes

Sparse Spectrum Gaussian Process Regression

An Application of Reinforcement Learning to Aerobatic Helicopter Flight

PEGASUS: A policy search method for large MDPs and POMDPs

Simulation-Based Optimization with Stochastic Approximation Using Common Random Numbers

A comparison of direct and model-based reinforcement learning

Locally Weighted Learning for Control

Differential dynamic programming

Improving PILCO with Bayesian Neural Network Dynamics Models

Dropout: a simple way to prevent neural networks from overfitting

Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems
