Poster in Workshop: Foundations of Reinforcement Learning and Control: Connections and Perspectives
Safe Reinforcement Learning with Contrastive Risk Prediction
Hanping Zhang · Yuhong Guo
As safety violations can lead to severe consequences in real-world applications, the increasing deployment of Reinforcement Learning (RL) in safety-critical domains such as robotics has propelled the study of safe exploration for reinforcement learning (safe RL). In this work, we propose a risk-preventive training method for safe RL that learns a binary classifier, based on contrastive sampling, to predict the probability of a state-action pair leading to unsafe states. Based on the predicted risk probabilities, risk-preventive trajectory exploration and optimality-criterion modification can be conducted simultaneously to induce safe RL policies. We conduct experiments in robotic simulation environments. The results show that the proposed approach outperforms existing model-free safe RL approaches and yields performance comparable to the state-of-the-art model-based method.
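The core idea — a binary classifier over state-action pairs whose predicted risk penalizes the learning criterion — can be sketched as below. This is a minimal illustration under assumptions, not the authors' implementation: the logistic-regression classifier, the toy labeling rule, and the reward-shaping form `r - lam * risk(s, a)` are all hypothetical stand-ins for the paper's contrastive-sampling and criterion-modification details.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy contrastive dataset (assumption): concatenated (state, action)
# feature vectors, labeled 1 if the pair leads to an unsafe state.
# Here pairs with feature sum > 1 are declared "unsafe" by construction.
X = rng.normal(size=(200, 4))
y = (X.sum(axis=1) > 1.0).astype(float)

# Train a logistic-regression risk classifier by gradient descent
# on the log-loss (a stand-in for the paper's binary classifier).
w, b = np.zeros(4), 0.0
for _ in range(500):
    p = sigmoid(X @ w + b)
    g = p - y
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

def risk(sa):
    """Predicted probability that a state-action pair leads to an unsafe state."""
    return sigmoid(sa @ w + b)

def shaped_reward(r, sa, lam=5.0):
    """One possible optimality-criterion modification (assumed form):
    penalize the environment reward by the predicted risk."""
    return r - lam * risk(sa)

# A risk-preventive policy would prefer the low-risk action here.
unsafe_sa = np.ones(4)    # unsafe by the toy labeling rule
safe_sa = -np.ones(4)     # safe by the toy labeling rule
print(risk(unsafe_sa), risk(safe_sa))
```

The penalty weight `lam` trades off task reward against predicted risk; risk-preventive exploration would similarly use `risk(sa)` to veto or down-weight candidate actions during trajectory collection.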