Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators

Koopman Constrained Policy Optimization: A Koopman operator theoretic method for differentiable optimal control in robotics

Matthew Retchin · Brandon Amos · Steven Brunton · Shuran Song


We introduce Koopman Constrained Policy Optimization (KCPO), combining implicitly differentiable model predictive control with a deep Koopman autoencoder for robot learning in unknown and nonlinear dynamical systems. KCPO is a new policy optimization algorithm that trains neural policies end-to-end with hard box constraints on controls. Guaranteed satisfaction of hard constraints helps ensure the performance and safety of robots. We perform imitation learning with KCPO to recover expert policies on the Simple Pendulum, Cartpole Swing-Up, Reacher, and Differential Drive environments, outperforming baseline methods in generalizing to out-of-distribution constraints in most environments after training.
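The core idea of planning in a lifted space under box constraints can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the KCPO implementation: it assumes an encoder has already produced a latent state `z0`, that the latent dynamics are linear (`z_next = A z + B u`, as a Koopman model would give), and it swaps in plain projected gradient descent for the implicitly differentiable MPC solver described in the abstract. All names (`rollout`, `cost_gradient`, `solve_box_mpc`) and the matrices are hypothetical.

```python
import numpy as np

def rollout(A, B, z0, U):
    """Roll out assumed linear latent dynamics z_{t+1} = A z_t + B u_t."""
    Z = [z0]
    for u in U:
        Z.append(A @ Z[-1] + B @ u)
    return np.stack(Z)  # shape (horizon + 1, latent_dim)

def cost(A, B, z0, U, Q, R):
    """Quadratic cost on latent states z_1..z_T and controls u_0..u_{T-1}."""
    Z = rollout(A, B, z0, U)
    state_cost = sum(z @ Q @ z for z in Z[1:])
    control_cost = sum(u @ R @ u for u in U)
    return state_cost + control_cost

def cost_gradient(A, B, z0, U, Q, R):
    """Gradient of the cost w.r.t. U via a backward (adjoint) pass.

    Assumes Q and R are symmetric.
    """
    Z = rollout(A, B, z0, U)
    grad = np.zeros_like(U)
    lam = np.zeros_like(z0)  # accumulates dJ/dz_{t+1}
    for t in reversed(range(len(U))):
        lam = 2.0 * Q @ Z[t + 1] + A.T @ lam
        grad[t] = 2.0 * R @ U[t] + B.T @ lam
    return grad

def solve_box_mpc(A, B, z0, Q, R, horizon, u_min, u_max, step=0.05, iters=200):
    """Projected gradient descent: each step is clipped back into the box,
    so every iterate satisfies the control bounds exactly."""
    U = np.zeros((horizon, B.shape[1]))
    for _ in range(iters):
        U = np.clip(U - step * cost_gradient(A, B, z0, U, Q, R), u_min, u_max)
    return U

# Hypothetical double-integrator-like latent model (illustrative values only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = 0.1 * np.eye(1)
z0 = np.array([1.0, 0.0])

U_opt = solve_box_mpc(A, B, z0, Q, R, horizon=20, u_min=-0.5, u_max=0.5)
```

Because the projection is applied at every iteration, the returned controls lie inside the box by construction, which is the sense in which the constraint is "hard" in this sketch. Training a policy end-to-end, as the abstract describes, would additionally require differentiating through the solver, which this example does not attempt.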
