Daniel Brown: Pitfalls and paths forward when learning rewards from human feedback
2023 Invited talk in Workshop: Interactive Learning with Implicit Human Feedback
Abstract
Human feedback is often incomplete, suboptimal, biased, and ambiguous, leading to misidentification of the human's true reward function and suboptimal agent behavior. I will discuss these pitfalls as well as some of our recent work that seeks to overcome these problems via techniques that calibrate to user biases, learn from multiple feedback types, use human feedback to align robot feature representations, and enable interpretable reward learning.