Reward Identification in Inverse Reinforcement Learning
Kuno Kim · Shivam Garg · Kirankumar Shiragur · Stefano Ermon

Wed Jul 21 06:30 AM -- 06:35 AM (PDT)

We study the problem of reward identifiability in the context of Inverse Reinforcement Learning (IRL). The reward identifiability question is critical when reasoning about how effectively Markov Decision Processes (MDPs) can serve as computational models of real-world decision makers, both for understanding complex decision-making behavior and for performing counterfactual reasoning. While identifiability has been acknowledged as a fundamental theoretical question in IRL, little is known about the types of MDPs for which rewards are identifiable, or even whether such MDPs exist. In this work, we formalize the reward identification problem in IRL and study how identifiability relates to properties of the MDP model. For deterministic MDP models with the MaxEntRL objective, we prove necessary and sufficient conditions for identifiability. Building on these results, we present efficient algorithms for testing whether or not an MDP model is identifiable.
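To make the identifiability question concrete, the sketch below illustrates one well-known source of reward ambiguity, potential-based reward shaping, on a small deterministic MDP under a maximum-entropy (soft) objective. It is not the authors' algorithm or their identifiability test. The three-state MDP, the potential `phi`, the discount of 0.9, and the helper names `soft_optimal_policy` and `shape_reward` are all assumptions made up for illustration.

```python
import math

def soft_optimal_policy(next_state, reward, gamma=0.9, iters=500):
    """Soft value iteration on a deterministic MDP (hypothetical helper).

    next_state[s][a] is the state reached by taking action a in state s,
    and reward[s][a] is the reward for that step.
    """
    n = len(next_state)
    v = [0.0] * n
    for _ in range(iters):
        new_v = []
        for s in range(n):
            q = [reward[s][a] + gamma * v[next_state[s][a]]
                 for a in range(len(next_state[s]))]
            m = max(q)
            # Numerically stable log-sum-exp: V(s) = log sum_a exp(Q(s, a)).
            new_v.append(m + math.log(sum(math.exp(x - m) for x in q)))
        v = new_v
    policy = []
    for s in range(n):
        q = [reward[s][a] + gamma * v[next_state[s][a]]
             for a in range(len(next_state[s]))]
        m = max(q)
        weights = [math.exp(x - m) for x in q]
        z = sum(weights)
        policy.append([w / z for w in weights])  # pi(a|s) is a softmax over Q(s, .)
    return policy

def shape_reward(next_state, reward, phi, gamma=0.9):
    """Return r'(s, a) = r(s, a) + gamma * phi(s') - phi(s)."""
    return [[reward[s][a] + gamma * phi[next_state[s][a]] - phi[s]
             for a in range(len(next_state[s]))]
            for s in range(len(next_state))]

# Assumed toy MDP: 3 states, 2 actions, deterministic transitions.
next_state = [[1, 2], [2, 0], [0, 1]]
reward = [[1.0, 0.0], [0.5, -1.0], [0.0, 2.0]]
phi = [3.0, -1.0, 0.5]  # arbitrary potential function over states

shaped = shape_reward(next_state, reward, phi)
pi_original = soft_optimal_policy(next_state, reward)
pi_shaped = soft_optimal_policy(next_state, shaped)

# Largest difference between the two policies over all state-action pairs.
max_gap = max(abs(p - q)
              for row_p, row_q in zip(pi_original, pi_shaped)
              for p, q in zip(row_p, row_q))
print("rewards differ:", reward != shaped)
print("max policy gap:", max_gap)
```

In this sketch the two reward tables differ, yet the soft-optimal policies agree up to numerical error, because shaping shifts the soft Q-values of each state by the same amount `phi[s]`. An observer who sees only the policy therefore cannot tell these two rewards apart, which is the kind of ambiguity any notion of reward identifiability has to account for.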

Author Information

Kuno Kim (Stanford University)
Shivam Garg (Stanford University)
Kirankumar Shiragur (Stanford University)
Stefano Ermon (Stanford University)
