Poster
Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning
Yuhui Wang · Qingyuan Wu · Dylan Ashley · Francesco Faccio · Weida Li · Chao Huang · Jürgen Schmidhuber
West Exhibition Hall B2-B3 #W-601
Planning is an essential skill for intelligent agents, enabling them to figure out how to reach goals in complex environments. A popular method, called Value Iteration Networks (VINs), allows artificial agents to plan by mimicking how humans and robots think ahead. However, VINs fail when the environment becomes large or the task requires many steps to complete. In this work, we propose an improved version of VIN, called Dynamic Transition VIN (DT-VIN). It introduces two key ideas: (1) a more flexible internal model that better captures the structure of the environment, and (2) a special training technique that helps extremely deep networks learn efficiently. These changes allow our model to plan across 5,000 steps, far more than previous methods. We test DT-VIN on a range of tasks, from simple maze navigation to controlling robots and planning routes for lunar rovers. Across all these tasks, DT-VIN consistently outperforms existing methods, showing that it is better at solving long, complicated planning problems. Our work brings AI one step closer to handling real-world challenges that involve complex, long-term decision-making.
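To make the link between network depth and planning horizon concrete, here is a minimal sketch of classic tabular value iteration, the algorithm that VINs build on. It is not the DT-VIN model: the function name, the grid encoding ('#' for walls, '.' for free cells), the reward of 1 for entering the goal, and the discount factor are all assumptions chosen for illustration. Each sweep spreads goal information one cell further, so a start cell d moves from the goal only receives a nonzero value after d sweeps.

```python
def value_iteration(grid, goal, num_iters, gamma=0.99):
    """Illustrative tabular value iteration on a 2D grid maze.

    grid: list of equal-length strings; '#' is a wall, '.' is free (assumed encoding).
    goal: (row, col) of the terminal goal cell.
    num_iters: number of sweeps, i.e. how many steps ahead the plan can see.
    """
    rows, cols = len(grid), len(grid[0])
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    values = [[0.0] * cols for _ in range(rows)]

    for _ in range(num_iters):
        new_values = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Walls and the terminal goal keep a value of 0.
                if grid[r][c] == '#' or (r, c) == goal:
                    continue
                best = float('-inf')
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    blocked = (
                        not (0 <= nr < rows and 0 <= nc < cols)
                        or grid[nr][nc] == '#'
                    )
                    if blocked:
                        nr, nc = r, c  # a blocked move leaves the agent in place
                    reward = 1.0 if (nr, nc) == goal else 0.0
                    best = max(best, reward + gamma * values[nr][nc])
                new_values[r][c] = best
        values = new_values
    return values


# A corridor of five cells with the goal at the far right, four moves from the start.
corridor = ["....."]
too_shallow = value_iteration(corridor, goal=(0, 4), num_iters=3)
deep_enough = value_iteration(corridor, goal=(0, 4), num_iters=4)
print(too_shallow[0][0], deep_enough[0][0])
```

In this toy corridor, three sweeps leave the start cell with no signal about the goal, while four sweeps are enough. The same reasoning suggests why tasks needing thousands of steps call for thousands of iterations, which in a VIN-style network correspond to layers.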