Oral
Online Linear Quadratic Control
Alon Cohen · Avinatan Hasidim · Tomer Koren · Nevena Lazic · Yishay Mansour · Kunal Talwar
We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to ``strongly stable'' policies that mix exponentially fast to a steady state.
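As a rough illustration of the setting described in the abstract, the sketch below simulates a scalar linear system with noise under a fixed linear policy and time-varying quadratic losses. It is not the paper's algorithm and does not involve the SDP relaxation. The scalar form $x_{t+1} = a x_t + b u_t + w_t$, the policy $u_t = k x_t$, the loss $q_t x_t^2 + r_t u_t^2$, and all function and parameter names are assumptions made for this example. In the scalar case, a closed-loop factor $|a + bk| < 1$ is what makes the state forget its starting point exponentially fast and settle to a steady-state variance of $\sigma^2 / (1 - (a + bk)^2)$.

```python
import random


def simulate(a, b, k, sigma, losses, x0=0.0, seed=0):
    """Roll out x_{t+1} = a*x_t + b*u_t + w_t under the policy u_t = k*x_t.

    `losses` is a list of (q_t, r_t) pairs; the cost paid at step t is
    q_t * x_t**2 + r_t * u_t**2. All names here are hypothetical.
    Returns (total cost, final state).
    """
    rng = random.Random(seed)
    x, total = x0, 0.0
    for q, r in losses:
        u = k * x
        total += q * x * x + r * u * u
        # w_t is i.i.d. zero-mean Gaussian noise with standard deviation sigma.
        x = a * x + b * u + rng.gauss(0.0, sigma)
    return total, x


def steady_state_variance(a, b, k, sigma):
    """Stationary variance of the state under u_t = k*x_t (scalar case)."""
    rho = a + b * k  # closed-loop factor
    if abs(rho) >= 1.0:
        raise ValueError("closed loop is not stable")
    return sigma ** 2 / (1.0 - rho ** 2)


if __name__ == "__main__":
    # Noise-free rollout: the state halves every step (1, 0.5, 0.25, ...).
    total, x_final = simulate(1.0, 1.0, -0.5, 0.0, [(1.0, 0.0)] * 3, x0=1.0)
    print(total, x_final)  # 1.3125 0.125
    print(steady_state_variance(1.0, 1.0, -0.5, 1.0))
```

In an online version of this problem the pairs `(q_t, r_t)` would be revealed only after the control is chosen, and regret would compare the learner's total cost against the best fixed policy in hindsight.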
Author Information
Alon Cohen (Google Inc.)
Avinatan Hasidim (Google)
Tomer Koren (Google Brain)
Nevena Lazic (Google)
Yishay Mansour (Google)
Kunal Talwar (Google)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: Online Linear Quadratic Control
  Fri. Jul 13th 04:15 -- 07:00 PM Room Hall B #115
More from the Same Authors
- 2021: Adversarial Robustness of Streaming Algorithms through Importance Sampling
  Vladimir Braverman · Avinatan Hasidim · Yossi Matias · Mariano Schain · Sandeep Silwal · Samson Zhou
- 2021: Minimax Regret for Stochastic Shortest Path
  Alon Cohen · Yonathan Efroni · Yishay Mansour · Aviv Rosenberg
- 2021: Neural Rate Control for Video Encoding using Imitation Learning
  Hongzi Mao · Chenjie Gu · Miaosen Wang · Angie Chen · Nevena Lazic · Nir Levine · Derek Pang · Rene Claus · Marisabel Hechtman · Ching-Han Chiang · Cheng Chen · Jingning Han
- 2022 Poster: Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
  Chris Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan
- 2022 Spotlight: Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation
  Chris Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan
- 2021 Poster: Improved Regret Bound and Experience Replay in Regularized Policy Iteration
  Nevena Lazic · Dong Yin · Yasin Abbasi-Yadkori · Csaba Szepesvari
- 2021 Oral: Improved Regret Bound and Experience Replay in Regularized Policy Iteration
  Nevena Lazic · Dong Yin · Yasin Abbasi-Yadkori · Csaba Szepesvari
- 2020 Poster: Near-optimal Regret Bounds for Stochastic Shortest Path
  Aviv Rosenberg · Alon Cohen · Yishay Mansour · Haim Kaplan
- 2020 Poster: Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
  Asaf Cassel · Alon Cohen · Tomer Koren
- 2019 Poster: POLITEX: Regret Bounds for Policy Iteration using Expert Prediction
  Yasin Abbasi-Yadkori · Peter Bartlett · Kush Bhatia · Nevena Lazic · Csaba Szepesvari · Gellért Weisz
- 2019 Poster: Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
  Alon Cohen · Tomer Koren · Yishay Mansour
- 2019 Poster: Semi-Cyclic Stochastic Gradient Descent
  Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar
- 2019 Oral: Semi-Cyclic Stochastic Gradient Descent
  Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar
- 2019 Oral: POLITEX: Regret Bounds for Policy Iteration using Expert Prediction
  Yasin Abbasi-Yadkori · Peter Bartlett · Kush Bhatia · Nevena Lazic · Csaba Szepesvari · Gellért Weisz
- 2019 Oral: Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
  Alon Cohen · Tomer Koren · Yishay Mansour
- 2019 Poster: Self-similar Epochs: Value in arrangement
  Eliav Buchnik · Edith Cohen · Avinatan Hasidim · Yossi Matias
- 2019 Oral: Self-similar Epochs: Value in arrangement
  Eliav Buchnik · Edith Cohen · Avinatan Hasidim · Yossi Matias
- 2018 Poster: Shampoo: Preconditioned Stochastic Tensor Optimization
  Vineet Gupta · Tomer Koren · Yoram Singer
- 2018 Oral: Shampoo: Preconditioned Stochastic Tensor Optimization
  Vineet Gupta · Tomer Koren · Yoram Singer