Title: Reward (Mis)design for Autonomous Driving and Accumulating Safety Rules from Catastrophic Action Effects
This talk highlights two recent findings pertaining to the safety of autonomous driving. The first is that most RL researchers use reward functions that are riskier than those reflected in the behavior of drunk teenage drivers; the second is a method for monotonically improving the safety of a fleet of vehicles over time.