The ability to form complex plans based on raw visual input is a litmus test for the current capabilities of artificial intelligence, as it requires a seamless combination of visual processing and abstract algorithmic execution, two traditionally separate areas of computer science. A recent surge of interest in this field has brought advances that yield good performance in tasks ranging from arcade games to continuous control; these methods, however, come with significant issues, such as limited generalization capabilities and difficulties when dealing with combinatorially hard planning instances. Our contribution is two-fold: (i) we present a method that learns to represent its environment as a latent graph and leverages state reidentification to reduce the complexity of finding a good policy from exponential to linear; and (ii) we introduce a set of lightweight environments with an underlying discrete combinatorial structure in which planning is challenging even for humans. Moreover, we show that our method achieves strong empirical generalization to variations in the environment, even across highly disadvantaged regimes, such as “one-shot” planning, or in an offline RL paradigm which only provides low-quality trajectories.
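To illustrate why state reidentification matters for search cost, the following is a minimal sketch on a toy grid world. It is not the method described in the abstract: the environment, the function names (`grid_neighbors`, `bfs_expansions`), and the use of exact state equality in place of a learned latent representation are all assumptions made for illustration. It compares breadth-first search that merges revisited states (graph search) against the same search that treats every path as a new node (tree search).

```python
from collections import deque


def grid_neighbors(state, size):
    """Hypothetical toy environment: 4-connected moves on a size x size grid."""
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield (nx, ny)


def bfs_expansions(start, goal, size, reidentify):
    """Return how many nodes are expanded before the goal is dequeued.

    With reidentify=True, states seen before are recognized and skipped,
    so each distinct state is expanded at most once. With reidentify=False,
    every path to a state becomes a separate node (tree search).
    """
    frontier = deque([start])
    seen = {start}
    expansions = 0
    while frontier:
        state = frontier.popleft()
        if state == goal:
            return expansions
        expansions += 1
        for nxt in grid_neighbors(state, size):
            if reidentify:
                if nxt in seen:
                    continue  # state reidentified: do not enqueue it again
                seen.add(nxt)
            frontier.append(nxt)
    return None  # goal unreachable


# Assumed example: corner-to-corner planning on a 4 x 4 grid.
with_reid = bfs_expansions((0, 0), (3, 3), 4, reidentify=True)
without_reid = bfs_expansions((0, 0), (3, 3), 4, reidentify=False)
print("expansions with reidentification:", with_reid)
print("expansions without reidentification:", without_reid)
```

With reidentification, the number of expansions is bounded by the number of distinct states; without it, the count grows with the number of distinct paths, which increases rapidly with the planning horizon. In the setting of the abstract, states would have to be recognized from pixels rather than compared by exact equality as in this sketch.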
Author Information
Marco Bagatella (Max Planck Institute for Intelligent Systems)
Miroslav Olšák
Michal Rolinek (Max Planck Institute for Intelligent Systems)
Georg Martius (Max Planck Institute for Intelligent Systems)
More from the Same Authors
- 2021 Oral Presentation: Planning from Pixels in Environments with Combinatorially Hard Search Spaces
  Georg Martius · Marco Bagatella
- 2021 Poster: Demystifying Inductive Biases for (Beta-)VAE Based Architectures
  Dominik Zietlow · Michal Rolinek · Georg Martius
- 2021 Spotlight: Demystifying Inductive Biases for (Beta-)VAE Based Architectures
  Dominik Zietlow · Michal Rolinek · Georg Martius
- 2021 Poster: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
  Anselm Paulus · Michal Rolinek · Vit Musil · Brandon Amos · Georg Martius
- 2021 Spotlight: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
  Anselm Paulus · Michal Rolinek · Vit Musil · Brandon Amos · Georg Martius
- 2021 Poster: Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
  Marin Vlastelica · Michal Rolinek · Georg Martius
- 2021 Spotlight: Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
  Marin Vlastelica · Michal Rolinek · Georg Martius
- 2018 Poster: Learning equations for extrapolation and control
  Subham S Sahoo · Christoph H. Lampert · Georg Martius
- 2018 Oral: Learning equations for extrapolation and control
  Subham S Sahoo · Christoph H. Lampert · Georg Martius