Morning Poster in Workshop: Artificial Intelligence & Human Computer Interaction

Active Reinforcement Learning from Demonstration in Continuous Action Spaces

Ming-Hsin Chen · Si-An Chen · Hsuan-Tien (Tien) Lin


Abstract:

Learning from Demonstration (LfD) is a human-in-the-loop paradigm that aims to overcome the safety concerns and weak data efficiency that limit Reinforcement Learning (RL). Active Reinforcement Learning from Demonstration (ARLD) takes LfD a step further by involving the human expert only at critical moments, reducing the cost of demonstrations. While successful ARLD strategies have been developed for RL environments with discrete actions, their potential in continuous action spaces has not been thoroughly explored. In this work, we propose a novel ARLD strategy designed specifically for continuous environments. Our strategy estimates the uncertainty of the current RL agent directly from the variance of the stochastic policy within the state-of-the-art Soft Actor-Critic RL model. We demonstrate that our strategy outperforms both a naive adaptation of existing ARLD strategies to continuous environments and the passive LfD strategy. These results validate the potential of ARLD in continuous environments and lay the foundation for future research in this direction.
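To make the query rule concrete, here is a minimal PyTorch sketch of the idea the abstract describes: a SAC-style actor already outputs a diagonal Gaussian over actions, so its per-dimension variance can serve directly as the uncertainty signal that decides when to ask the expert for a demonstration. The names `GaussianActor`, `should_query_expert`, and the threshold `tau` are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class GaussianActor(nn.Module):
    """Toy SAC-style actor: outputs mean and log-std of a diagonal Gaussian."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, action_dim)
        self.log_std_head = nn.Linear(hidden, action_dim)

    def forward(self, state):
        h = self.body(state)
        return self.mean_head(h), self.log_std_head(h)

def should_query_expert(actor, state, tau=0.5):
    """Query the human expert when the policy's own variance is high.

    Key idea from the abstract: in continuous action spaces, the variance
    of SAC's stochastic policy is itself an uncertainty estimate, so no
    separate uncertainty model is needed to trigger a demonstration query.
    (`tau` is a hypothetical threshold hyperparameter.)
    """
    with torch.no_grad():
        _, log_std = actor(state)
        variance = torch.exp(2.0 * log_std)   # sigma^2 per action dimension
        uncertainty = variance.mean().item()  # aggregate across dimensions
    return uncertainty > tau

# Usage: during a rollout, act with the agent unless it is uncertain.
actor = GaussianActor(state_dim=8, action_dim=2)
state = torch.randn(1, 8)
print(should_query_expert(actor, state))  # True -> ask the expert to act
```

In a full ARLD loop, a `True` result would hand control to the human for that step and store the resulting state-action pair as a demonstration; a `False` result lets the agent act on its own, which is what keeps the demonstration cost low.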
