Timezone: »
Poster
Logarithmic Regret for Adversarial Online Control
Dylan Foster · Max Simchowitz
We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances. Existing regret bounds for this setting scale as $\sqrt{T}$ unless strong stochastic assumptions are imposed on the disturbance process. We give the first algorithm with logarithmic regret for arbitrary adversarial disturbance sequences, provided the state and control costs are given by known quadratic functions. Our algorithm and analysis use a characterization for the optimal offline control law to reduce the online control problem to (delayed) online learning with approximate advantage functions. Compared to previous techniques, our approach does not need to control movement costs for the iterates, leading to logarithmic regret.
Author Information
Dylan Foster (MIT)
Max Simchowitz (UC Berkeley)
More from the Same Authors
-
2022 : Interaction-Grounded Learning with Action-inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 Poster: Contextual Bandits with Large Action Spaces: Made Practical »
Yinglun Zhu · Dylan Foster · John Langford · Paul Mineiro -
2022 Spotlight: Contextual Bandits with Large Action Spaces: Made Practical »
Yinglun Zhu · Dylan Foster · John Langford · Paul Mineiro -
2022 : Q&A II »
Dylan Foster · Alexander Rakhlin -
2022 : Bridging Learning and Decision Making: Part II »
Dylan Foster -
2022 : Q&A »
Dylan Foster · Alexander Rakhlin -
2022 Tutorial: Bridging Learning and Decision Making »
Dylan Foster · Alexander Rakhlin -
2021 Poster: Task-Optimal Exploration in Linear Dynamical Systems »
Andrew Wagenmaker · Max Simchowitz · Kevin Jamieson -
2021 Oral: Task-Optimal Exploration in Linear Dynamical Systems »
Andrew Wagenmaker · Max Simchowitz · Kevin Jamieson -
2020 Workshop: Theoretical Foundations of Reinforcement Learning »
Emma Brunskill · Thodoris Lykouris · Max Simchowitz · Wen Sun · Mengdi Wang -
2020 Poster: Naive Exploration is Optimal for Online LQR »
Max Simchowitz · Dylan Foster -
2020 Poster: Reward-Free Exploration for Reinforcement Learning »
Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu -
2020 Poster: Improved Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance »
Blair Bilodeau · Dylan Foster · Daniel Roy -
2020 Poster: Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning »
Esther Rolf · Max Simchowitz · Sarah Dean · Lydia T. Liu · Daniel Bjorkegren · Moritz Hardt · Joshua Blumenstock -
2020 Poster: Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles »
Dylan Foster · Alexander Rakhlin -
2019 Poster: Distributed Learning with Sublinear Communication »
Jayadev Acharya · Christopher De Sa · Dylan Foster · Karthik Sridharan -
2019 Oral: Distributed Learning with Sublinear Communication »
Jayadev Acharya · Christopher De Sa · Dylan Foster · Karthik Sridharan -
2019 Poster: The Implicit Fairness Criterion of Unconstrained Learning »
Lydia T. Liu · Max Simchowitz · Moritz Hardt -
2019 Oral: The Implicit Fairness Criterion of Unconstrained Learning »
Lydia T. Liu · Max Simchowitz · Moritz Hardt -
2018 Poster: Delayed Impact of Fair Machine Learning »
Lydia T. Liu · Sarah Dean · Esther Rolf · Max Simchowitz · Moritz Hardt -
2018 Poster: Practical Contextual Bandits with Regression Oracles »
Dylan Foster · Alekh Agarwal · Miroslav Dudik · Haipeng Luo · Robert Schapire -
2018 Oral: Practical Contextual Bandits with Regression Oracles »
Dylan Foster · Alekh Agarwal · Miroslav Dudik · Haipeng Luo · Robert Schapire -
2018 Oral: Delayed Impact of Fair Machine Learning »
Lydia T. Liu · Sarah Dean · Esther Rolf · Max Simchowitz · Moritz Hardt