Workshop: Complex feedback in online learning

Adversarial Attacks Against Imitation and Inverse Reinforcement Learning

Ezgi Korkmaz


Learning from raw high-dimensional observations has become possible with the help of deep neural networks. Building on this advance, reinforcement learning research is progressing at one of its fastest rates to date. Policies trained with deep reinforcement learning are being deployed in many different settings, from medical applications to industrial control. Yet reinforcement learning still needs a reward function to learn functioning policies, which can be restrictive for certain types of tasks. To learn in these tasks, several studies have focused on different ways of learning by observing a set of trajectories from an optimal policy. One line of research focuses on learning a reward function from a set of observations, referred to as inverse reinforcement learning; another focuses on learning an optimal policy from observed trajectories, referred to as imitation learning. In our paper we investigate the robustness of state-of-the-art deep imitation learning policies and deep inverse reinforcement learning policies to adversarial vectors. We demonstrate that policies trained with simple vanilla deep reinforcement learning are more robust than deep inverse reinforcement learning and deep imitation learning policies trained in MDPs with complex state representations.
