Koopman Constrained Policy Optimization: A Koopman operator theoretic method for differentiable optimal control in robotics
Matthew Retchin · Brandon Amos · Steven Brunton · Shuran Song
Event URL: https://openreview.net/forum?id=3W7vPqWCeM

We introduce Koopman Constrained Policy Optimization (KCPO), combining implicitly differentiable model predictive control with a deep Koopman autoencoder for robot learning in unknown and nonlinear dynamical systems. KCPO is a new policy optimization algorithm that trains neural policies end-to-end with hard box constraints on controls. Guaranteed satisfaction of hard constraints helps ensure the performance and safety of robots. We perform imitation learning with KCPO to recover expert policies on the Simple Pendulum, Cartpole Swing-Up, Reacher, and Differential Drive environments, outperforming baseline methods in generalizing to out-of-distribution constraints in most environments after training.
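To make the idea of control under hard box constraints in a linear latent space concrete, here is a minimal sketch. It is not the authors' implementation: where the abstract describes a learned deep Koopman autoencoder and implicitly differentiable model predictive control, this sketch assumes fixed, hand-picked latent matrices `A` and `B`, a quadratic cost, and plain projected gradient descent as a stand-in solver. All function names and parameters (`rollout`, `cost_and_grad`, `solve_box_mpc`, `lr`, `steps`) are hypothetical.

```python
import numpy as np

def rollout(A, B, z0, U):
    """Roll out assumed linear latent dynamics z_{t+1} = A z_t + B u_t."""
    Z = [z0]
    for u in U:
        Z.append(A @ Z[-1] + B @ u)
    return np.stack(Z)  # shape (T + 1, n)

def cost_and_grad(A, B, z0, U, Q, R):
    """Quadratic cost sum_t (z_{t+1}^T Q z_{t+1} + u_t^T R u_t) and its gradient in U.

    Q and R are assumed symmetric. The gradient comes from a backward
    (adjoint) pass over the linear dynamics.
    """
    Z = rollout(A, B, z0, U)
    cost = sum(z @ Q @ z for z in Z[1:]) + sum(u @ R @ u for u in U)
    grad = np.zeros_like(U)
    lam = np.zeros_like(z0)  # adjoint: derivative of the cost w.r.t. z_{t+1}
    for t in reversed(range(len(U))):
        lam = 2 * Q @ Z[t + 1] + A.T @ lam
        grad[t] = 2 * R @ U[t] + B.T @ lam
    return cost, grad

def solve_box_mpc(A, B, z0, Q, R, u_min, u_max, horizon=20, steps=200, lr=0.01):
    """Projected gradient descent: every iterate is clipped into [u_min, u_max],
    so the returned controls satisfy the box constraints by construction."""
    U = np.zeros((horizon, B.shape[1]))
    for _ in range(steps):
        _, g = cost_and_grad(A, B, z0, U, Q, R)
        U = np.clip(U - lr * g, u_min, u_max)
    return U

# Usage with an illustrative double-integrator latent model (assumed values).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
z0 = np.array([1.0, 0.0])
Q, R = np.eye(2), 0.1 * np.eye(1)
U = solve_box_mpc(A, B, z0, Q, R, u_min=-0.5, u_max=0.5)
```

Because the projection is applied at every step, the bounds hold exactly rather than approximately, which is the sense in which a box constraint is "hard". Changing `u_min` and `u_max` at solve time, without changing the model, loosely mirrors the out-of-distribution constraint setting mentioned in the abstract.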

Author Information

Matthew Retchin (Columbia Artificial Intelligence and Robotics Lab)
Brandon Amos (Meta)
Steven Brunton (Princeton University)
Shuran Song (Columbia University)
