We study privacy-preserving exploration in sequential decision-making for environments that rely on sensitive data such as medical records. In particular, we focus on the problem of reinforcement learning (RL) subject to the constraint of (joint) differential privacy in the linear MDP setting, where both dynamics and rewards are given by linear functions. Prior work on this problem by Luyo et al. (2021) achieves a regret rate with a dependence of O(K^{3/5}) on the number of episodes K. We provide a private algorithm with an improved regret rate that has an optimal dependence of O(√K) on the number of episodes. The key ingredient behind our stronger regret guarantee is an adaptive policy update schedule, in which an update occurs only when sufficient changes in the data are detected. As a result, our algorithm benefits from low switching cost and performs only O(log(K)) updates, which greatly reduces the amount of privacy noise. Finally, in the most prevalent privacy regimes, where the privacy parameter \epsilon is a constant, our algorithm incurs negligible privacy cost: compared with the existing non-private regret bounds, the additional regret due to privacy appears only in lower-order terms.
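To make the idea of an adaptive update schedule concrete, here is a minimal sketch of one common low-switching rule from the non-private linear bandit and linear MDP literature: recompute the policy only when the determinant of the regularized feature Gram matrix has grown by a constant factor since the last update. This is an illustrative assumption, not the paper's algorithm. The abstract does not specify the exact trigger, and the sketch adds no privacy noise. The function name `count_policy_updates` and the parameters `threshold` and `reg` are hypothetical.

```python
import math
import random


def count_policy_updates(features, threshold=2.0, reg=1.0):
    """Count how often a determinant-growth trigger fires on a feature stream.

    Assumption (not taken from the paper): the policy is recomputed whenever
    det(reg * I + sum_k x_k x_k^T) has grown by a factor of `threshold`
    since the last recomputation. No privacy noise is added here.
    """
    d = len(features[0])
    # Inverse of the regularized Gram matrix, starting from (reg * I)^{-1}.
    a_inv = [[1.0 / reg if i == j else 0.0 for j in range(d)] for i in range(d)]
    log_det = d * math.log(reg)        # log-determinant of the current Gram matrix
    log_det_at_update = log_det        # log-determinant at the last policy update
    updates = 0
    for x in features:
        # v = A^{-1} x and q = x^T A^{-1} x for the current Gram matrix A.
        v = [sum(a_inv[i][j] * x[j] for j in range(d)) for i in range(d)]
        q = sum(x[i] * v[i] for i in range(d))
        # Matrix determinant lemma: det(A + x x^T) = det(A) * (1 + q).
        log_det += math.log1p(q)
        # Sherman-Morrison rank-one update of the inverse.
        for i in range(d):
            for j in range(d):
                a_inv[i][j] -= v[i] * v[j] / (1.0 + q)
        # Trigger an update only when the data has changed "enough".
        if log_det - log_det_at_update >= math.log(threshold):
            updates += 1
            log_det_at_update = log_det
    return updates


if __name__ == "__main__":
    random.seed(0)
    stream = []
    for _ in range(2000):
        g = [random.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        stream.append([c / norm for c in g])  # unit-norm feature vectors
    print("episodes:", len(stream), "policy updates:", count_policy_updates(stream))
```

With feature vectors of norm at most 1, the log-determinant can grow by at most d·log(1 + K/(d·reg)) over K episodes, so a doubling trigger of this kind fires at most d·log2(1 + K/(d·reg)) times. This is the sense in which such a schedule yields only logarithmically many updates in K. In a private variant, fewer updates would mean fewer noisy releases of statistics, consistent with the abstract's claim that low switching cost reduces the amount of privacy noise.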
Author Information
Dung Ngo (University of Minnesota)
Giuseppe Vietri (University of Minnesota)
Steven Wu (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Improved Regret for Differentially Private Exploration in Linear MDP »
Wed. Jul 20th through Thu. Jul 21st, Room Hall E #1020
More from the Same Authors
-
2020 : Contributed Talk: Incentivizing Bandit Exploration:Recommendations as Instruments »
Dung Ngo · Logan Stapleton · Vasilis Syrgkanis · Steven Wu -
2021 : Towards the Unification and Robustness of Perturbation and Gradient Based Explanations »
Sushant Agarwal · Shahin Jabbari · Chirag Agarwal · Sohini Upadhyay · Steven Wu · Hima Lakkaraju -
2021 : Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses »
Keegan Harris · Dung Ngo · Logan Stapleton · Hoda Heidari · Steven Wu -
2021 : Stateful Strategic Regression »
Keegan Harris · Hoda Heidari · Steven Wu -
2021 : Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods »
Terrance Liu · Giuseppe Vietri · Steven Wu -
2021 : Private Multi-Task Learning: Formulation and Applications to Federated Learning »
Shengyuan Hu · Steven Wu · Virginia Smith -
2021 : Understanding Clipped FedAvg: Convergence and Client-Level Differential Privacy »
Xinwei Zhang · Xiangyi Chen · Steven Wu · Mingyi Hong -
2021 : Improved Privacy Filters and Odometers: Time-Uniform Bounds in Privacy Composition »
Justin Whitehouse · Aaditya Ramdas · Ryan Rogers · Steven Wu -
2021 : Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap »
Gokul Swamy · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2021 : Scalable Algorithms for Nonlinear Causal Inference »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2022 : Meta-Learning Adversarial Bandits »
Nina Balcan · Keegan Harris · Mikhail Khodak · Steven Wu -
2023 : Complementing a Policy with a Different Observation Space »
Gokul Swamy · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2023 : Adaptive Principal Component Regression with Applications to Panel Data »
Anish Agarwal · Keegan Harris · Justin Whitehouse · Steven Wu -
2023 : Strategyproof Decision-Making in Panel Data Settings and Beyond »
Keegan Harris · Anish Agarwal · Chara Podimata · Steven Wu -
2023 : Strategic Apple Tasting »
Keegan Harris · Chara Podimata · Steven Wu -
2023 : Learning Shared Safety Constraints from Multi-task Demonstrations »
Konwoo Kim · Gokul Swamy · Zuxin Liu · Ding Zhao · Sanjiban Choudhury · Steven Wu -
2023 Poster: Fully-Adaptive Composition in Differential Privacy »
Justin Whitehouse · Aaditya Ramdas · Ryan Rogers · Steven Wu -
2023 Oral: Nonparametric Extensions of Randomized Response for Private Confidence Sets »
Ian Waudby-Smith · Steven Wu · Aaditya Ramdas -
2023 Poster: Nonparametric Extensions of Randomized Response for Private Confidence Sets »
Ian Waudby-Smith · Steven Wu · Aaditya Ramdas -
2023 Poster: Inverse Reinforcement Learning without Reinforcement Learning »
Gokul Swamy · David Wu · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2023 Poster: Generating Private Synthetic Data with Genetic Algorithms »
Terrance Liu · Jingwu Tang · Giuseppe Vietri · Steven Wu -
2022 Poster: Information Discrepancy in Strategic Learning »
Yahav Bechavod · Chara Podimata · Steven Wu · Juba Ziani -
2022 Poster: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao -
2022 Poster: Causal Imitation Learning under Temporally Correlated Noise »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2022 Spotlight: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao -
2022 Spotlight: Information Discrepancy in Strategic Learning »
Yahav Bechavod · Chara Podimata · Steven Wu · Juba Ziani -
2022 Oral: Causal Imitation Learning under Temporally Correlated Noise »
Gokul Swamy · Sanjiban Choudhury · James Bagnell · Steven Wu -
2022 Poster: Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses »
Keegan Harris · Dung Ngo · Logan Stapleton · Hoda Heidari · Steven Wu -
2022 Poster: Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning »
Alberto Bietti · Chen-Yu Wei · Miroslav Dudik · John Langford · Steven Wu -
2022 Poster: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy »
Xinwei Zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi -
2022 Spotlight: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy »
Xinwei Zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi -
2022 Spotlight: Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses »
Keegan Harris · Dung Ngo · Logan Stapleton · Hoda Heidari · Steven Wu -
2022 Spotlight: Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning »
Alberto Bietti · Chen-Yu Wei · Miroslav Dudik · John Langford · Steven Wu -
2021 Poster: Leveraging Public Data for Practical Private Query Release »
Terrance Liu · Giuseppe Vietri · Thomas Steinke · Jonathan Ullman · Steven Wu -
2021 Spotlight: Leveraging Public Data for Practical Private Query Release »
Terrance Liu · Giuseppe Vietri · Thomas Steinke · Jonathan Ullman · Steven Wu -
2021 Poster: Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap »
Gokul Swamy · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2021 Spotlight: Of Moments and Matching: A Game-Theoretic Framework for Closing the Imitation Gap »
Gokul Swamy · Sanjiban Choudhury · J. Bagnell · Steven Wu -
2021 Poster: Towards the Unification and Robustness of Perturbation and Gradient Based Explanations »
Sushant Agarwal · Shahin Jabbari · Chirag Agarwal · Sohini Upadhyay · Steven Wu · Hima Lakkaraju -
2021 Poster: Incentivizing Compliance with Algorithmic Instruments »
Dung Ngo · Logan Stapleton · Vasilis Syrgkanis · Steven Wu -
2021 Spotlight: Incentivizing Compliance with Algorithmic Instruments »
Dung Ngo · Logan Stapleton · Vasilis Syrgkanis · Steven Wu -
2021 Spotlight: Towards the Unification and Robustness of Perturbation and Gradient Based Explanations »
Sushant Agarwal · Shahin Jabbari · Chirag Agarwal · Sohini Upadhyay · Steven Wu · Hima Lakkaraju -
2020 Poster: New Oracle-Efficient Algorithms for Private Synthetic Data Release »
Giuseppe Vietri · Grace Tian · Mark Bun · Thomas Steinke · Steven Wu -
2020 Poster: Oracle Efficient Private Non-Convex Optimization »
Seth Neel · Aaron Roth · Giuseppe Vietri · Steven Wu -
2020 Poster: Private Reinforcement Learning with PAC and Regret Guarantees »
Giuseppe Vietri · Borja de Balle Pigem · Akshay Krishnamurthy · Steven Wu