Safe Exploration is an approach to collect ground truth data by safely interacting with the environment. Source: Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems
(Image credit: Papersgraph)
These leaderboards are used to track progress in Safe Exploration.
No benchmarks available.
Use these libraries to find Safe Exploration models and implementations.
No datasets available.
No subtasks available.
This work addresses the problem of deploying a reinforcement learning agent on a physical system, such as a datacenter cooling unit or a robot, where critical constraints must never be violated, and directly adds to the policy a safety layer that analytically solves an action-correction formulation for each state.
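The per-state action correction described above can be sketched as a projection of the proposed action onto a linearized safety constraint. This is a minimal illustration, assuming a single constraint of the form g·a + c ≤ 0 (the function name and arguments are illustrative, not the paper's API):

```python
import numpy as np

def safety_layer(action, g, c):
    """Project a proposed action onto the half-space g . a + c <= 0.

    Closed-form solution of  min ||a' - action||^2  s.t.  g . a' + c <= 0,
    for one linearized safety constraint (an assumption of this sketch).
    """
    violation = g @ action + c
    if violation <= 0.0:
        return action  # already safe: no correction needed
    # Lagrange multiplier of the single active constraint
    lam = violation / (g @ g)
    return action - lam * g
```

For example, with g = [1, 0] and c = 0.5, the proposed action [1, 0] violates the constraint and is corrected to [-0.5, 0], which lies exactly on the constraint boundary.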
The feasible actor-critic (FAC) algorithm is introduced: the first model-free constrained RL method that considers statewise safety, i.e., safety for each initial state, with theoretical guarantees that FAC outperforms previous expectation-based constrained RL methods in both constraint satisfaction and reward optimization.
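Statewise safety can be contrasted with a single expectation-based constraint by giving each state its own Lagrange multiplier. The following is a simplified, hypothetical sketch of one dual-ascent step for one state's multiplier (FAC itself learns a multiplier network; names here are illustrative):

```python
def statewise_lagrangian(reward_q, cost_q, lam, cost_limit, lr=0.1):
    """One projected dual-ascent step of a statewise Lagrange multiplier.

    reward_q, cost_q: Q-value estimates at the current state-action.
    lam: the multiplier for THIS state (statewise, not one global scalar).
    Returns the Lagrangian policy objective and the updated multiplier.
    """
    objective = reward_q - lam * cost_q
    # multiplier grows while this state's expected cost exceeds its limit
    lam_new = max(0.0, lam + lr * (cost_q - cost_limit))
    return objective, lam_new
```

Because the multiplier is per state, a state whose cost estimate exceeds the limit is penalized even when the constraint holds on average over initial states.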
A suite of reinforcement learning environments illustrating various safety properties of intelligent agents, including safe interruptibility, avoiding side effects, absent supervisor, reward gaming, safe exploration, as well as robustness to self-modification, distributional shift, and adversaries, is presented.
The preliminary findings indicate that the approach, based on the empirical cumulative distribution function (ECDF), can provide a basis for detecting whether the application context of an ML component is valid in safety- and security-critical settings.
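For reference, the ECDF mentioned above is straightforward to compute from a sample; this is a minimal, generic sketch unrelated to the paper's specific pipeline:

```python
import numpy as np

def ecdf(sample):
    """Return the empirical cumulative distribution function of a 1-D sample.

    F(x) = fraction of observations <= x.
    """
    xs = np.sort(np.asarray(sample))
    def F(x):
        return np.searchsorted(xs, x, side="right") / len(xs)
    return F
```

Comparing the ECDF of inputs seen at runtime against the ECDF of the training data is one common way to flag that a model is operating outside its valid context.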
The generalization experiments, conducted on both procedurally generated and real-world scenarios, show that increasing the diversity and size of the training set improves the RL agent's generalizability.
A novel algorithm is developed and proven able to completely explore the safely reachable part of the MDP without violating the safety constraint; it is demonstrated on digital terrain models for the task of exploring an unknown map with a rover.
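The core idea of exploring only the safely reachable part of the state space can be sketched as growing a certified-safe set: a neighboring state is added only when Lipschitz continuity of the safety function guarantees it stays above a threshold. A minimal sketch, with illustrative names and a dictionary-based terrain model rather than the paper's actual formulation:

```python
def expand_safe_set(safe, value, dist, h_min, L):
    """One expansion step of the provably safe set.

    A candidate state sp is added when some already-safe state s certifies it:
    value[s] - L * dist[s][sp] >= h_min, so by L-Lipschitz continuity of the
    safety function, value[sp] >= h_min is guaranteed before visiting sp.
    """
    new_safe = set(safe)
    for s in safe:
        for sp, d in dist.get(s, {}).items():
            if value[s] - L * d >= h_min:
                new_safe.add(sp)
    return new_safe
```

Iterating this step until the set stops growing yields the safely reachable region; states that cannot be certified from any safe neighbor are never visited.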
A list of five practical research problems related to accident risk, categorized according to whether the problem originates from having the wrong objective function, an objective function that is too expensive to evaluate frequently, or undesirable behavior during the learning process, is presented.
This paper presents a learning-based model predictive control scheme that can provide provable high-probability safety guarantees, and exploits regularity assumptions on the dynamics, in the form of a Gaussian process prior, to construct provably accurate confidence intervals on predicted trajectories.
This paper presents a learning-based model predictive control scheme that provides high-probability safety guarantees throughout the learning process, and constructs provably accurate confidence intervals on predicted trajectories based on a reliable statistical model.
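The confidence intervals used by the learning-based MPC schemes above come from a Gaussian process posterior: the predicted mean plus or minus a scaled posterior standard deviation. A minimal one-dimensional sketch with an RBF kernel (the papers' statistical models and scaling factors are more elaborate; all names here are illustrative):

```python
import numpy as np

def gp_posterior(X, y, x_star, lengthscale=1.0, noise=1e-3):
    """GP posterior mean and std at query points x_star (1-D RBF kernel)."""
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)
    K = k(X, X) + noise * np.eye(len(X))        # kernel matrix + noise
    k_star = k(X, x_star)                       # cross-covariances
    mu = k_star.T @ np.linalg.solve(K, y)       # posterior mean
    var = k(x_star, x_star).diagonal() - np.einsum(
        "ij,ij->j", k_star, np.linalg.solve(K, k_star))
    return mu, np.sqrt(np.maximum(var, 0.0))
```

A high-probability interval on the prediction is then mu ± beta * sigma for a confidence scaling beta (e.g. beta = 2); the MPC scheme keeps the whole interval, not just the mean, inside the safe region.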
This work introduces a new learning method for contextual bandit problems, the Safe Exploration Algorithm (SEA), which overcomes the drawbacks of existing approaches: it never performs worse than the baseline policy and does not harm the user experience, while still exploring the action space and thus being able to find an optimal policy.
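The "never worse than the baseline" behavior can be sketched as a guarded action choice: take the exploratory action only when a high-confidence lower bound on its reward is at least the baseline's value, and fall back to the baseline otherwise. This is a simplified reading of the safety condition, with hypothetical function arguments:

```python
def sea_choose_action(context, explore_policy, baseline_policy,
                      lower_bound, baseline_value):
    """Guarded exploration for a contextual bandit (illustrative sketch).

    explore_policy / baseline_policy: map a context to an action.
    lower_bound(context, action): high-confidence lower bound on reward.
    baseline_value(context): estimated reward of the baseline's action.
    """
    a_explore = explore_policy(context)
    if lower_bound(context, a_explore) >= baseline_value(context):
        return a_explore          # provably no worse than the baseline
    return baseline_policy(context)  # fall back: protect user experience
```

As reward estimates tighten with more data, the lower bound rises and the guard permits exploration more often, so the policy can still converge toward an optimal one.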