[1] MOReL: Model-Based Offline Reinforcement Learning
[2] D4RL: Datasets for Deep Data-Driven Reinforcement Learning
[3] DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
[4] Interference and Generalization in Temporal Difference Learning
[5] Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning
[6] AlgaeDICE: Policy Gradient from Arbitrary Experience
[7] RoboNet: Large-Scale Multi-Robot Learning
[8] Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
[9] Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
[10] Behavior Regularized Offline Reinforcement Learning
[11] Deep Dynamics Models for Learning Dexterous Manipulation
[12] Striving for Simplicity in Off-policy Deep Reinforcement Learning
[13] Benchmarking Model-Based Reinforcement Learning
[14] Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
[15] Exploring Model-based Planning with Policy Networks
[16] When to Trust Your Model: Model-Based Policy Optimization
[17] Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
[18] Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
[19] Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
[20] Model-Based Reinforcement Learning for Atari
[21] Guidelines for reinforcement learning in healthcare
[22] Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
[23] Off-Policy Deep Reinforcement Learning without Exploration
[24] Quantifying Generalization in Reinforcement Learning
[25] Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control
[26] Model-Based Reinforcement Learning via Meta-Policy Optimization
[27] Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
[28] Accurate Uncertainties for Deep Learning Using Calibrated Regression
[29] A Dissection of Overfitting and Generalization in Continuous Reinforcement Learning
[30] The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces
[31] Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
[32] BDD100K: A Diverse Driving Video Database with Scalable Annotation Tooling
[34] Addressing Function Approximation Error in Actor-Critic Methods
[35] Spectral Normalization for Generative Adversarial Networks
[36] Model-Ensemble Trust-Region Policy Optimization
[37] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
[38] Proximal Policy Optimization Algorithms
[39] Imagination-Augmented Agents for Deep Reinforcement Learning
[40] Value Prediction Network
[41] Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
[42] The Predictron: End-To-End Learning and Planning
[43] Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
[44] Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
[45] Deep visual foresight for planning robot motion
[46] SQuAD: 100,000+ Questions for Machine Comprehension of Text
[47] Safe and Efficient Off-Policy Reinforcement Learning
[48] Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks
[49] Optimal control with learned local models: Application to dexterous manipulation
[50] Value Iteration Networks
[51] Asynchronous Methods for Deep Reinforcement Learning
[52] Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
[53] Continuous control with deep reinforcement learning
[54] Trust Region Policy Optimization
[55] Wiener's Polynomial Chaos for the Analysis and Control of Nonlinear Dynamical Systems with Probabilistic Uncertainties [Historical Perspectives]
[57] MuJoCo: A physics engine for model-based control
[58] Linear Off-Policy Actor-Critic
[59] PILCO: A Model-Based and Data-Efficient Approach to Policy Search
[60] Scalable Approach to Uncertainty Quantification and Robust Design of Interconnected Dynamical Systems
[61] ImageNet: A large-scale hierarchical image database
[62] On integral probability metrics, φ-divergences and binary classification
[63] An analysis of model-based Interval Estimation for Markov Decision Processes
[64] Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
[65] A Kernel Approach to Comparing Distributions
[66] Off-Policy Temporal Difference Learning with Function Approximation
[67] Integral Probability Metrics and Their Generating Classes of Functions
[68] Model Predictive Control Using Neural Networks [25 Years Ago]
[69] Dyna, an integrated architecture for learning, planning, and reacting
[70] Neuronlike adaptive elements that can solve difficult learning control problems
[71] Some Asymptotic Theory for the Bootstrap
[73] Improving PILCO with Bayesian Neural Network Dynamics Models
[74] Safe Reinforcement Learning
[75] Batch Reinforcement Learning
[76] Multi-Step Dyna Planning for Policy Evaluation and Control