Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. We show that, for a certain subclass of the MDP framework, this can be alleviated by a neuro-algorithmic policy architecture that embeds a time-dependent shortest path solver in a deep neural network. Trained end-to-end via blackbox-differentiation, this method leads to considerable improvement in generalization capabilities in the low-data regime.
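To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of the two ingredients the abstract names: a time-dependent shortest path solver and a blackbox-differentiation backward pass. It assumes a small grid whose vertex costs change at every time step, an agent that may wait or move to a 4-neighbour, and a goal that is reachable within the horizon. The names `tdsp`, `blackbox_grad`, `grad_path`, and `lam` are hypothetical and chosen for illustration; in the architecture described above, the costs would be predicted by a neural network rather than written by hand.

```python
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # wait, or step to a 4-neighbour


def tdsp(costs, start, goal):
    """Cheapest path on a grid whose vertex costs change over time.

    costs[t][r][c] is the cost of occupying cell (r, c) at time step t.
    Returns (total_cost, path) with path[t] the cell occupied at step t,
    or (inf, None) if the goal cannot be reached within the horizon.
    """
    T, H, W = len(costs), len(costs[0]), len(costs[0][0])
    inf = float("inf")
    # best[t][cell]: cheapest cost of being in `cell` at time t
    best = [{start: costs[0][start[0]][start[1]]}]
    parent = [{}]
    for t in range(1, T):
        layer, back = {}, {}
        for (r, c), d in best[t - 1].items():
            for dr, dc in MOVES:
                nr, nc = r + dr, c + dc
                if 0 <= nr < H and 0 <= nc < W:
                    nd = d + costs[t][nr][nc]
                    if nd < layer.get((nr, nc), inf):
                        layer[(nr, nc)] = nd
                        back[(nr, nc)] = (r, c)
        best.append(layer)
        parent.append(back)
    # Choose the arrival time with the lowest accumulated cost.
    arrivals = [t for t in range(T) if goal in best[t]]
    if not arrivals:
        return inf, None
    t_end = min(arrivals, key=lambda t: best[t][goal])
    # Walk the parent pointers back to the start.
    path, cell = [goal], goal
    for t in range(t_end, 0, -1):
        cell = parent[t][cell]
        path.append(cell)
    path.reverse()
    return best[t_end][goal], path


def blackbox_grad(costs, start, goal, grad_path, lam=20.0):
    """Blackbox-differentiation style gradient with respect to the costs.

    grad_path maps (t, r, c) to dL/dy, the incoming gradient on the path
    indicator. The solver is called a second time on costs shifted by
    lam * dL/dy, and the gradient is -(y - y_lam) / lam, returned as a
    sparse dict. Assumes the goal is reachable under both cost tensors.
    """
    _, path = tdsp(costs, start, goal)
    shifted = [[row[:] for row in layer] for layer in costs]
    for (t, r, c), g in grad_path.items():
        shifted[t][r][c] += lam * g
    _, path_lam = tdsp(shifted, start, goal)
    y = {(t, r, c) for t, (r, c) in enumerate(path)}
    y_lam = {(t, r, c) for t, (r, c) in enumerate(path_lam)}
    grad = {k: -1.0 / lam for k in y - y_lam}
    grad.update({k: 1.0 / lam for k in y_lam - y})
    return grad


# A 1x3 corridor over 4 time steps; the middle cell is expensive at t = 1,
# so the cheapest plan waits one step before moving.
costs = [[[1.0, 1.0, 1.0]] for _ in range(4)]
costs[1][0][1] = 10.0
total, path = tdsp(costs, (0, 0), (0, 2))
# Suppose the loss penalises waiting in the start cell at t = 1.
grad = blackbox_grad(costs, (0, 0), (0, 2), {(1, 0, 0): 1.0})
```

In this toy setting the gradient is negative on the waiting cell the loss penalises, so a gradient-descent step on the costs would raise that cell's cost and push the solver away from it. The hyperparameter `lam` trades off how informative the gradient is against how faithful it is to the piecewise-constant solver.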
Author Information
Marin Vlastelica (Max Planck Institute for Intelligent Systems)
Michal Rolinek (Max Planck Institute for Intelligent Systems)
Georg Martius (Max Planck Institute for Intelligent Systems)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Neuro-algorithmic Policies Enable Fast Combinatorial Generalization
  Tue. Jul 20th 01:25 -- 01:30 PM
More from the Same Authors
- 2020: Discrete Planning with End-to-end Trained Neuro-algorithmic Policies
  Marin Vlastelica
- 2020: (#46 / Sess. 2) Discrete Planning with End-to-end Trained Neuro-algorithmic Policies
  Marin Vlastelica
- 2021: Planning from Pixels in Environments with Combinatorially Hard Search Spaces
  Marco Bagatella · Miroslav Olšák · Michal Rolinek · Georg Martius
- 2021: Oral Presentation: Planning from Pixels in Environments with Combinatorially Hard Search Spaces
  Georg Martius · Marco Bagatella
- 2021 Poster: Demystifying Inductive Biases for (Beta-)VAE Based Architectures
  Dominik Zietlow · Michal Rolinek · Georg Martius
- 2021 Spotlight: Demystifying Inductive Biases for (Beta-)VAE Based Architectures
  Dominik Zietlow · Michal Rolinek · Georg Martius
- 2021 Poster: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
  Anselm Paulus · Michal Rolinek · Vit Musil · Brandon Amos · Georg Martius
- 2021 Spotlight: CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints
  Anselm Paulus · Michal Rolinek · Vit Musil · Brandon Amos · Georg Martius
- 2018 Poster: Learning equations for extrapolation and control
  Subham S Sahoo · Christoph H. Lampert · Georg Martius
- 2018 Oral: Learning equations for extrapolation and control
  Subham S Sahoo · Christoph H. Lampert · Georg Martius