
Understanding the Generalization Gap in Visual Reinforcement Learning
Anurag Ajay · Ge Yang · Ofir Nachum · Pulkit Agrawal

Deep Reinforcement Learning (RL) agents have achieved superhuman performance on several video game suites. However, unlike humans, the trained policies fail to transfer between related games or even between different levels of the same game. Recent works have shown that ideas such as data augmentation and learning domain-invariant features can reduce this generalization gap, but transfer performance remains unsatisfactory. In this work, we use procedurally generated video games to empirically investigate several hypotheses explaining the lack of transfer. Contrary to the belief that a lack of generalizable visual features results in poor policy generalization, we find that visual features do transfer across levels; rather, the inability to use these features to predict actions in new levels limits the overall transfer. We also show that simple auxiliary tasks can improve generalization and lead to policies that transfer as well as state-of-the-art methods using data augmentation. Finally, to inform fruitful avenues for future research, we construct simple oracle methods that close the generalization gap.

Author Information

Anurag Ajay (MIT)
Ge Yang (University of Chicago)
Ofir Nachum (Google Brain)
Pulkit Agrawal (MIT)
