Learning Optimal Advantage from Preferences and Mistaking it for Reward

William Knox ⋅ Stephane Hatgis-Kessell ⋅ Sigurdur Adalgeirsson ⋅ Serena Booth ⋅ Anca Dragan ⋅ Peter Stone ⋅ Scott Niekum

Abstract
