Poster in Workshop: Responsible Decision Making in Dynamic Environments
Optimal Dynamic Regret in LQR Control
Dheeraj Baby · Yu-Xiang Wang
Abstract:
We consider the problem of nonstochastic control with a sequence of quadratic losses, i.e., LQR control. We provide an efficient online algorithm that achieves an optimal dynamic (policy) regret of $\tilde{O}(\max\{n^{1/3}\,\mathcal{TV}(M_{1:n})^{2/3}, 1\})$, where $\mathcal{TV}(M_{1:n})$ is the total variation of any oracle sequence of Disturbance Action policies parameterized by $M_1, \ldots, M_n$, chosen in hindsight to cater to unknown nonstationarity. The rate improves on the best known rate of $\tilde{O}(\sqrt{n(\mathcal{TV}(M_{1:n}) + 1)})$ for general convex losses and is information-theoretically optimal for LQR. The main technical components include the reduction of LQR to online linear regression with delayed feedback due to Foster and Simchowitz (2020), as well as a new proper learning algorithm with an optimal dynamic regret on a family of "minibatched" quadratic losses, which could be of independent interest.
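To give a feel for how the two regret rates compare, the following is a minimal sketch, not the authors' algorithm. It assumes the total variation of the comparator sequence is the sum of entrywise L1 distances between consecutive parameter matrices (the choice of norm is an assumption here), and it evaluates the rates max(n^(1/3) TV^(2/3), 1) and sqrt(n (TV + 1)) with constants and logarithmic factors dropped. The function names and the toy sequence are hypothetical.

```python
import numpy as np


def total_variation(params):
    """Path length of a sequence of parameter matrices.

    Assumption: distance between consecutive matrices is the entrywise
    L1 norm; other norms would give a different path length.
    """
    params = np.asarray(params, dtype=float)
    if len(params) < 2:
        return 0.0
    # Differences between consecutive matrices along the time axis.
    return float(np.abs(np.diff(params, axis=0)).sum())


def lqr_rate(n, tv):
    """max(n^(1/3) * TV^(2/3), 1), ignoring constants and log factors."""
    return max(n ** (1 / 3) * tv ** (2 / 3), 1.0)


def convex_rate(n, tv):
    """sqrt(n * (TV + 1)), ignoring constants and log factors."""
    return (n * (tv + 1.0)) ** 0.5


# Toy comparator: 2x2 parameters that switch once, halfway through.
n = 1000
M = np.concatenate([np.zeros((n // 2, 2, 2)), np.ones((n // 2, 2, 2))])
tv = total_variation(M)  # one jump of size 1 in each of 4 entries

print(f"TV = {tv}")
print(f"LQR rate    ~ {lqr_rate(n, tv):.1f}")
print(f"convex rate ~ {convex_rate(n, tv):.1f}")
```

For a comparator that changes only occasionally, the first expression grows like n^(1/3) while the second grows like n^(1/2), which is the gap the abstract refers to. For a fixed comparator (zero total variation), the first expression reduces to a constant.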