Oral
Online learning with kernel losses
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett
We present a generalization of the adversarial linear bandits framework in which the underlying losses are kernel functions (with an associated reproducing kernel Hilbert space) rather than linear functions. We study a version of the exponential weights algorithm and bound its regret in this setting. Under conditions on the eigen-decay of the kernel, we provide a sharp characterization of the regret of this algorithm. Under polynomial eigen-decay ($\mu_j \le \mathcal{O}(j^{-\beta})$), the regret is bounded by $\mathcal{R}_n \le \mathcal{O}(n^{\beta/(2(\beta-1))})$. Under exponential eigen-decay ($\mu_j \le \mathcal{O}(e^{-\beta j})$), we obtain an even tighter bound, $\mathcal{R}_n \le \tilde{\mathcal{O}}(n^{1/2})$. When the eigen-decay is polynomial, we show a non-matching minimax lower bound on the regret of $\mathcal{R}_n \ge \Omega(n^{(\beta+1)/(2\beta)})$, and a lower bound of $\mathcal{R}_n \ge \Omega(n^{1/2})$ when the eigenvalues decay exponentially fast.
We also study the full-information setting in which the underlying losses are kernel functions, and present an adapted exponential weights algorithm and a conditional gradient descent algorithm.
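For readers unfamiliar with exponential weights, the following is a minimal sketch of the classical full-information version on a finite action set, with the loss of action $a$ at round $t$ given by a kernel evaluation $k(a, w_t)$. It is not the algorithm analyzed in the paper. The Gaussian kernel, the grid of actions, the random adversary, the step size, and all function names are assumptions made for illustration only.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on real numbers; an illustrative choice only."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def exponential_weights(actions, adversary_points, eta, kernel=gaussian_kernel):
    """Full-information exponential weights over a finite action set.

    At each round the loss of action a is kernel(a, w), where w is the
    adversary's point, revealed after the learner fixes its distribution.
    Returns (learner's expected cumulative loss, loss of best fixed action).
    """
    cum_loss = [0.0] * len(actions)
    expected_loss = 0.0
    for w in adversary_points:
        # Weights proportional to exp(-eta * cumulative loss); subtracting
        # the minimum keeps the exponentials numerically stable.
        m = min(cum_loss)
        weights = [math.exp(-eta * (c - m)) for c in cum_loss]
        total = sum(weights)
        probs = [v / total for v in weights]
        # Full information: the loss of every action is observed.
        losses = [kernel(a, w) for a in actions]
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]
    return expected_loss, min(cum_loss)


def run_demo(n=500, seed=0):
    """Run on a hypothetical grid of 21 actions against random points."""
    rng = random.Random(seed)
    actions = [i / 10.0 for i in range(-10, 11)]
    points = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Standard step size for losses in [0, 1] with a known horizon n.
    eta = math.sqrt(8.0 * math.log(len(actions)) / n)
    learner, best = exponential_weights(actions, points, eta)
    # Classical guarantee for this step size: regret <= sqrt(n ln(K) / 2).
    return learner - best, math.sqrt(n * math.log(len(actions)) / 2.0)


if __name__ == "__main__":
    regret, bound = run_demo()
    print(f"expected regret = {regret:.3f}, classical bound = {bound:.3f}")
```

Because the Gaussian kernel takes values in $(0, 1]$, the classical finite-action guarantee for this step size applies here. The bandit setting of the abstract differs in that only the loss of the played action is observed, and the bounds quoted above depend on the eigen-decay of the kernel rather than on the number of actions.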
Author Information
Niladri Chatterji (UC Berkeley)
Aldo Pacchiano (UC Berkeley)
Peter Bartlett (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Online learning with kernel losses »
Wed. Jun 12th 01:30 -- 04:00 AM Room Pacific Ballroom #185
More from the Same Authors
-
2021 : Finite-Sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime »
Niladri Chatterji · Phil Long -
2021 : When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations? »
Niladri Chatterji · Phil Long · Peter Bartlett -
2021 : Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 : Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection »
Matteo Papini · Andrea Tirinzoni · Aldo Pacchiano · Marcello Restelli · Alessandro Lazaric · Matteo Pirotta -
2021 : Estimating Optimal Policy Value in Linear Contextual Bandits beyond Gaussianity »
Jonathan Lee · Weihao Kong · Aldo Pacchiano · Vidya Muthukumar · Emma Brunskill -
2021 : Meta Learning MDPs with linear transition models »
Robert Müller · Aldo Pacchiano · Jack Parker-Holder -
2021 : On the Theory of Reinforcement Learning with Once-per-Episode Feedback »
Niladri Chatterji · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2023 : Experiment Planning with Function Approximation »
Aldo Pacchiano · Jonathan Lee · Emma Brunskill -
2023 : Anytime Model Selection in Linear Bandits »
Parnian Kassraie · Aldo Pacchiano · Nicolas Emmenegger · Andreas Krause -
2023 : Undo Maps: A Tool for Adapting Policies to Perceptual Distortions »
Abhi Gupta · Ted Moskovitz · David Alvarez-Melis · Aldo Pacchiano -
2023 : In-Context Decision-Making from Supervised Pretraining »
Jonathan Lee · Annie Xie · Aldo Pacchiano · Yash Chandak · Chelsea Finn · Ofir Nachum · Emma Brunskill -
2023 Poster: Leveraging Offline Data in Online Reinforcement Learning »
Andrew Wagenmaker · Aldo Pacchiano -
2023 Poster: Deep linear networks can benignly overfit when shallow ones do »
Niladri S. Chatterji · Phil Long -
2022 Poster: Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback »
Tianyi Lin · Aldo Pacchiano · Yaodong Yu · Michael Jordan -
2022 Spotlight: Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback »
Tianyi Lin · Aldo Pacchiano · Yaodong Yu · Michael Jordan -
2021 : Adversarial Examples in Random Deep Networks »
Peter Bartlett -
2021 Poster: Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2021 Poster: Dynamic Balancing for Model Selection in Bandits and RL »
Ashok Cutkosky · Christoph Dann · Abhimanyu Das · Claudio Gentile · Aldo Pacchiano · Manish Purohit -
2021 Spotlight: Dynamic Balancing for Model Selection in Bandits and RL »
Ashok Cutkosky · Christoph Dann · Abhimanyu Das · Claudio Gentile · Aldo Pacchiano · Manish Purohit -
2021 Spotlight: Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity »
Dhruv Malik · Aldo Pacchiano · Vishwak Srinivasan · Yuanzhi Li -
2020 Poster: On Thompson Sampling with Langevin Algorithms »
Eric Mazumdar · Aldo Pacchiano · Yian Ma · Michael Jordan · Peter Bartlett -
2020 Poster: Accelerated Message Passing for Entropy-Regularized MAP Inference »
Jonathan Lee · Aldo Pacchiano · Peter Bartlett · Michael Jordan -
2020 Poster: Stochastic Flows and Geometric Optimization on the Orthogonal Group »
Krzysztof Choromanski · David Cheikhi · Jared Quincy Davis · Valerii Likhosherstov · Achille Nazaret · Achraf Bahamou · Xingyou Song · Mrugank Akarte · Jack Parker-Holder · Jacob Bergquist · Yuan Gao · Aldo Pacchiano · Tamas Sarlos · Adrian Weller · Vikas Sindhwani -
2020 Poster: Learning to Score Behaviors for Guided Policy Optimization »
Aldo Pacchiano · Jack Parker-Holder · Yunhao Tang · Krzysztof Choromanski · Anna Choromanska · Michael Jordan -
2020 Poster: Ready Policy One: World Building Through Active Learning »
Philip Ball · Jack Parker-Holder · Aldo Pacchiano · Krzysztof Choromanski · Stephen Roberts -
2019 Poster: Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning »
Dong Yin · Yudong Chen · Kannan Ramchandran · Peter Bartlett -
2019 Poster: Scale-free adaptive planning for deterministic dynamics & discounted rewards »
Peter Bartlett · Victor Gabillon · Jennifer Healey · Michal Valko -
2019 Oral: Scale-free adaptive planning for deterministic dynamics & discounted rewards »
Peter Bartlett · Victor Gabillon · Jennifer Healey · Michal Valko -
2019 Oral: Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning »
Dong Yin · Yudong Chen · Kannan Ramchandran · Peter Bartlett -
2019 Poster: Rademacher Complexity for Adversarially Robust Generalization »
Dong Yin · Kannan Ramchandran · Peter Bartlett -
2019 Oral: Rademacher Complexity for Adversarially Robust Generalization »
Dong Yin · Kannan Ramchandran · Peter Bartlett -
2018 Poster: Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks »
Peter Bartlett · Dave Helmbold · Phil Long -
2018 Poster: On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo »
Niladri Chatterji · Nicolas Flammarion · Yian Ma · Peter Bartlett · Michael Jordan -
2018 Oral: On the Theory of Variance Reduction for Stochastic Gradient Monte Carlo »
Niladri Chatterji · Nicolas Flammarion · Yian Ma · Peter Bartlett · Michael Jordan -
2018 Oral: Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks »
Peter Bartlett · Dave Helmbold · Phil Long -
2018 Poster: Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates »
Dong Yin · Yudong Chen · Kannan Ramchandran · Peter Bartlett -
2018 Oral: Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates »
Dong Yin · Yudong Chen · Kannan Ramchandran · Peter Bartlett -
2017 Poster: Recovery Guarantees for One-hidden-layer Neural Networks »
Kai Zhong · Zhao Song · Prateek Jain · Peter Bartlett · Inderjit Dhillon -
2017 Talk: Recovery Guarantees for One-hidden-layer Neural Networks »
Kai Zhong · Zhao Song · Prateek Jain · Peter Bartlett · Inderjit Dhillon