Poster in Workshop: Models of Human Feedback for AI Alignment

Learning to Assist Humans without Inferring Rewards

Vivek Myers ⋅ Evan Ellis ⋅ Benjamin Eysenbach ⋅ Sergey Levine ⋅ Anca Dragan

2024 Poster

Abstract

Assistive agents should make humans' lives easier. Classically, such assistance is studied through the lens of inverse reinforcement learning, where an assistive agent (e.g., a chatbot, a robot) infers a human's intention and then selects actions to help the human reach that goal. This approach requires inferring intentions, which can be difficult in high-dimensional settings. We build upon prior work that studies assistance through the lens of empowerment: an assistive agent aims to maximize the influence of the human's actions, so that the human exerts greater control over environmental outcomes and can solve tasks in fewer steps. We lift the major limitation of prior work in this area, scalability to high-dimensional settings, with contrastive successor representations. We formally prove that these representations estimate a notion of empowerment similar to that studied by prior work and provide a ready-made mechanism for optimizing it. Empirically, our proposed method outperforms prior methods on synthetic benchmarks and scales to Overcooked, a cooperative game setting. Theoretically, our work connects ideas from information theory, neuroscience, and reinforcement learning, and charts a path for representations to play a critical role in solving assistive problems.
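
To make the empowerment objective concrete, here is a minimal toy sketch; it is not the method proposed in the abstract. It assumes a one-step, discrete, deterministic world and measures empowerment as the mutual information between the human's action and the next state under uniformly random actions, a simplification of the notion the paper studies. The corridor world, the action names, and the function `one_step_empowerment` are all hypothetical, chosen only for illustration. The contrastive successor representations named in the abstract are aimed at high-dimensional settings where this kind of exact counting is not feasible.

```python
import math
from collections import Counter


def one_step_empowerment(transitions):
    """Return I(A; S') in bits for uniformly random actions.

    `transitions` maps each human action to the single next state it
    leads to. Because outcomes are deterministic, H(S' | A) = 0 and the
    mutual information reduces to the entropy of the next state, H(S').
    """
    n_actions = len(transitions)
    state_counts = Counter(transitions.values())
    return -sum(
        (count / n_actions) * math.log2(count / n_actions)
        for count in state_counts.values()
    )


# Hypothetical toy world: the human stands in a corridor facing a door.
# With the door closed, "forward" is blocked and leaves the human in place.
door_closed = {
    "left": "west end",
    "right": "east end",
    "stay": "corridor",
    "forward": "corridor",
}

# If an assistant opens the door, "forward" now reaches a new state.
door_open = dict(door_closed, forward="room")

empowerment_closed = one_step_empowerment(door_closed)
empowerment_open = one_step_empowerment(door_open)
print(empowerment_closed, empowerment_open)
```

In this toy world, opening the door lets each of the human's four actions lead to a distinct outcome, so the human's actions carry more influence over the next state. An empowerment-maximizing assistant would therefore prefer to open the door without ever inferring which room the human wants to reach.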
