Timezone: »
Poster
Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret
Asaf Cassel · Tomer Koren
We consider the task of learning to control a linear dynamical system under fixed quadratic costs, known as the Linear Quadratic Regulator (LQR) problem. While model-free approaches are often favorable in practice, thus far only model-based methods, which rely on costly system identification, have been shown to achieve regret that scales with the optimal dependence on the time horizon T. We present the first model-free algorithm that achieves similar regret guarantees. Our method relies on an efficient policy gradient scheme, and a novel and tighter analysis of the cost of exploration in policy space in this setting.
Author Information
Asaf Cassel (Tel Aviv University)
Tomer Koren (Tel Aviv University and Google)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Spotlight: Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret »
Wed. Jul 21st 01:25 -- 01:30 PM Room
More from the Same Authors
-
2023 Poster: Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime »
Hilal Asi · Vitaly Feldman · Tomer Koren · Kunal Talwar -
2023 Poster: Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation »
Uri Sherman · Tomer Koren · Yishay Mansour -
2023 Poster: Regret Minimization and Convergence to Equilibria in General-sum Markov Games »
Liad Erez · Tal Lancewicki · Uri Sherman · Tomer Koren · Yishay Mansour -
2023 Poster: Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation »
Orin Levy · Alon Cohen · Asaf Cassel · Yishay Mansour -
2023 Poster: SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to Unknown Parameters, Unbounded Gradients and Affine Variance »
Amit Attia · Tomer Koren -
2021 Poster: Private Stochastic Convex Optimization: Optimal Rates in L1 Geometry »
Hilal Asi · Vitaly Feldman · Tomer Koren · Kunal Talwar -
2021 Oral: Private Stochastic Convex Optimization: Optimal Rates in L1 Geometry »
Hilal Asi · Vitaly Feldman · Tomer Koren · Kunal Talwar -
2021 Poster: Adversarial Dueling Bandits »
Aadirupa Saha · Tomer Koren · Yishay Mansour -
2021 Spotlight: Adversarial Dueling Bandits »
Aadirupa Saha · Tomer Koren · Yishay Mansour -
2021 Poster: Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions »
Tal Lancewicki · Shahar Segal · Tomer Koren · Yishay Mansour -
2021 Spotlight: Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions »
Tal Lancewicki · Shahar Segal · Tomer Koren · Yishay Mansour -
2021 Poster: Dueling Convex Optimization »
Aadirupa Saha · Tomer Koren · Yishay Mansour -
2021 Spotlight: Dueling Convex Optimization »
Aadirupa Saha · Tomer Koren · Yishay Mansour -
2020 Poster: Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently »
Asaf Cassel · Alon Cohen · Tomer Koren