

Talk in Workshop: Theoretical Foundations of Reinforcement Learning

Short Talk 4 - Adaptive Regret for Online Control

Edgar Minasyan


Abstract:

We consider regret minimization for online control with time-varying linear dynamical systems. The metric of performance we study is adaptive policy regret, or regret compared to the best policy on any interval in time. We give an efficient algorithm that attains first-order adaptive regret guarantees for the setting of online convex optimization with memory, subsequently used to derive a controller with such guarantees. We show that these bounds are nearly tight and validate these theoretical findings experimentally on simulations of time-varying dynamics and disturbances.
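To make the metric concrete, here is a minimal sketch of how adaptive regret differs from ordinary (static) regret. It is a simplification and not the algorithm from the talk: it assumes a finite set of fixed comparators and memoryless per-step losses, whereas the abstract concerns a policy class and losses with memory. The function names `static_regret` and `adaptive_regret` and the toy loss values are hypothetical, chosen only for illustration.

```python
from itertools import accumulate


def prefix_sums(xs):
    """Return [0, x0, x0+x1, ...] so that sum(xs[r:s]) == P[s] - P[r]."""
    return [0.0] + list(accumulate(float(x) for x in xs))


def static_regret(learner_losses, comparator_losses):
    """Learner's total loss minus the best fixed comparator's total loss."""
    return sum(learner_losses) - min(sum(c) for c in comparator_losses)


def adaptive_regret(learner_losses, comparator_losses):
    """Worst regret over every contiguous interval of rounds.

    For each interval, the learner is compared with the comparator that is
    best on that interval alone (brute force, O(T^2 * K)).
    """
    T = len(learner_losses)
    lp = prefix_sums(learner_losses)
    cps = [prefix_sums(c) for c in comparator_losses]
    worst = float("-inf")
    for r in range(T):
        for s in range(r + 1, T + 1):
            learner_sum = lp[s] - lp[r]
            best_comparator = min(cp[s] - cp[r] for cp in cps)
            worst = max(worst, learner_sum - best_comparator)
    return worst


# Toy data (hypothetical): comparator A is good early, comparator B is good late.
comparators = [
    [0, 0, 1, 1],  # A
    [1, 1, 0, 0],  # B
]
learner = [0, 0, 1, 1]  # a learner that follows A throughout

print(static_regret(learner, comparators))
print(adaptive_regret(learner, comparators))
```

In this toy case the learner matches the best fixed comparator over the whole horizon, so its static regret is zero, yet on the last two rounds it loses to comparator B. Adaptive regret picks up that interval, which is why it is the more demanding metric when the environment changes over time.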

Paula Gradu, Elad Hazan, Edgar Minasyan
