Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Yevgen Chebotar · Karol Hausman · Marvin Zhang · Gaurav Sukhatme · Stefan Schaal · Sergey Levine

Tue Aug 08 05:48 PM -- 06:06 PM (PDT) @ C4.5

Reinforcement learning algorithms for real-world robotic applications must be able to handle complex, unknown dynamical systems while maintaining data-efficient learning. These requirements are handled well by model-free and model-based RL approaches, respectively. In this work, we aim to combine the advantages of these approaches. By focusing on time-varying linear-Gaussian policies, we enable a model-based algorithm based on the linear-quadratic regulator that can be integrated into the model-free framework of path integral policy improvement. We can further combine our method with guided policy search to train arbitrary parameterized policies such as deep neural networks. Our simulation and real-world experiments demonstrate that this method can solve challenging manipulation tasks with comparable or better performance than model-free methods while maintaining the sample efficiency of model-based methods.
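As a rough illustration of the two ingredients named in the abstract, the sketch below samples from a time-varying linear-Gaussian policy, u_t ~ N(K_t x_t + k_t, Sigma_t), and applies a simplified path-integral-style reweighting in which samples with lower cost-to-go receive exponentially larger weight. It is not the authors' algorithm: the function names, the temperature parameter eta, and the choice to update only the bias term at a single time step are assumptions made for illustration, and the LQR-based model-based update is omitted entirely.

```python
import numpy as np


def sample_action(K_t, k_t, Sigma_t, x_t, rng):
    """Draw u_t ~ N(K_t x_t + k_t, Sigma_t) for one time step."""
    mean = K_t @ x_t + k_t
    return rng.multivariate_normal(mean, Sigma_t)


def pi2_weights(costs_to_go, eta=1.0):
    """Softmax weights over samples: lower cost-to-go gives a larger weight.

    eta is a hypothetical temperature; subtracting the minimum cost keeps
    the exponentials numerically stable without changing the result.
    """
    s = np.asarray(costs_to_go, dtype=float)
    w = np.exp(-(s - s.min()) / eta)
    return w / w.sum()


def reweighted_bias(sampled_biases, costs_to_go, eta=1.0):
    """Probability-weighted mean of the sampled bias terms at one time step."""
    w = pi2_weights(costs_to_go, eta)
    return w @ np.asarray(sampled_biases, dtype=float)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K_t = np.zeros((2, 3))          # feedback gain (assumed zero here)
    k_t = np.zeros(2)               # bias term to be improved
    Sigma_t = 0.5 * np.eye(2)       # exploration covariance
    x_t = np.ones(3)                # a fixed example state
    target = np.array([1.0, -1.0])  # toy target action defining the cost

    actions = np.array([sample_action(K_t, k_t, Sigma_t, x_t, rng)
                        for _ in range(200)])
    costs = np.sum((actions - target) ** 2, axis=1)
    # Remove the feedback part so that only the bias is averaged.
    new_k_t = reweighted_bias(actions - K_t @ x_t, costs, eta=0.5)
    print(new_k_t)
```

In this toy setup the reweighted bias moves from zero toward the target action, because samples closer to the target dominate the weighted mean. A smaller eta makes the update greedier.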

Author Information

Yevgen Chebotar (University of Southern California)
Karol Hausman (University of Southern California)
Marvin Zhang (UC Berkeley)
Gaurav Sukhatme (University of Southern California)
Stefan Schaal
Sergey Levine (UC Berkeley)
Sergey Levine

Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as computer vision and graphics. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.
