Timezone: »
We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown. Recent results in this setting have demonstrated efficient learning algorithms with regret growing with the square root of the number of decision steps. We present new efficient algorithms that achieve, perhaps surprisingly,regret that scales only (poly-)logarithmically with the number of steps, in two scenarios: when only the state transition matrix A is unknown, and when only the state-action transition matrix B is unknown and the optimal policy satisfies a certain non-degeneracy condition. On the other hand, we give a lower bound which shows that when the latter condition is violated, square root regret is unavoidable.
Author Information
Asaf Cassel (Tel Aviv University)
Alon Cohen (Technion, Google)
Tomer Koren (Tel Aviv University)
More from the Same Authors
-
2023 Poster: Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation »
Orin Levy · Alon Cohen · Asaf Cassel · Yishay Mansour -
2021 Poster: Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret »
Asaf Cassel · Tomer Koren -
2021 Spotlight: Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret »
Asaf Cassel · Tomer Koren -
2019 Poster: Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret »
Alon Cohen · Tomer Koren · Yishay Mansour -
2019 Poster: Semi-Cyclic Stochastic Gradient Descent »
Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar -
2019 Oral: Semi-Cyclic Stochastic Gradient Descent »
Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar -
2019 Oral: Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret »
Alon Cohen · Tomer Koren · Yishay Mansour -
2018 Poster: Online Linear Quadratic Control »
Alon Cohen · Avinatan Hasidim · Tomer Koren · Nevena Lazic · Yishay Mansour · Kunal Talwar -
2018 Oral: Online Linear Quadratic Control »
Alon Cohen · Avinatan Hasidim · Tomer Koren · Nevena Lazic · Yishay Mansour · Kunal Talwar -
2018 Poster: Shampoo: Preconditioned Stochastic Tensor Optimization »
Vineet Gupta · Tomer Koren · Yoram Singer -
2018 Oral: Shampoo: Preconditioned Stochastic Tensor Optimization »
Vineet Gupta · Tomer Koren · Yoram Singer