Timezone: »
Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy preserving exploration policies for episodic reinforcement learning (RL). We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP)--a strong variant of differential privacy for settings where each user receives their own sets of output (e.g., policy recommendations). We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee. Our algorithm only pays for a moderate privacy cost on exploration: in comparison to the non-private bounds, the privacy parameter only appears in lower-order terms. Finally, we present lower bounds on sample complexity and regret for reinforcement learning subject to JDP.
Author Information
Giuseppe Vietri (University of Minnesota)
Borja de Balle Pigem (Amazon Research)
Akshay Krishnamurthy (Microsoft Research)
Steven Wu (University of Minnesota)
More from the Same Authors
-
2020 Poster: Doubly robust off-policy evaluation with shrinkage »
Yi Su · Maria Dimakopoulou · Akshay Krishnamurthy · Miroslav Dudik -
2020 Poster: Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning »
Dipendra Misra · Mikael Henaff · Akshay Krishnamurthy · John Langford -
2020 Poster: New Oracle-Efficient Algorithms for Private Synthetic Data Release »
Giuseppe Vietri · Grace Tian · Mark Bun · Thomas Steinke · Steven Wu -
2020 Poster: Reward-Free Exploration for Reinforcement Learning »
Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu -
2020 Poster: Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis »
Vidyashankar Sivakumar · Steven Wu · Arindam Banerjee -
2020 Poster: Privately Learning Markov Random Fields »
Huanyu Zhang · Gautam Kamath · Janardhan Kulkarni · Steven Wu -
2020 Poster: Adaptive Estimator Selection for Off-Policy Evaluation »
Yi Su · Pavithra Srinath · Akshay Krishnamurthy -
2020 Poster: Private Query Release Assisted by Public Data »
Raef Bassily · Albert Cheu · Shay Moran · Aleksandar Nikolov · Jonathan Ullman · Steven Wu -
2020 Poster: Oracle Efficient Private Non-Convex Optimization »
Seth Neel · Aaron Roth · Giuseppe Vietri · Steven Wu -
2019 Poster: Fair Regression: Quantitative Definitions and Reduction-Based Algorithms »
Alekh Agarwal · Miroslav Dudik · Steven Wu -
2019 Oral: Fair Regression: Quantitative Definitions and Reduction-Based Algorithms »
Alekh Agarwal · Miroslav Dudik · Steven Wu -
2019 Poster: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments »
Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos -
2019 Poster: Orthogonal Random Forest for Causal Inference »
Miruna Oprescu · Vasilis Syrgkanis · Steven Wu -
2019 Oral: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments »
Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos -
2019 Oral: Orthogonal Random Forest for Causal Inference »
Miruna Oprescu · Vasilis Syrgkanis · Steven Wu -
2019 Poster: Provably efficient RL with Rich Observations via Latent State Decoding »
Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford -
2019 Poster: Locally Private Bayesian Inference for Count Models »
Aaron Schein · Steven Wu · Alexandra Schofield · Mingyuan Zhou · Hanna Wallach -
2019 Oral: Provably efficient RL with Rich Observations via Latent State Decoding »
Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford -
2019 Oral: Locally Private Bayesian Inference for Count Models »
Aaron Schein · Steven Wu · Alexandra Schofield · Mingyuan Zhou · Hanna Wallach -
2018 Poster: Semiparametric Contextual Bandits »
Akshay Krishnamurthy · Steven Wu · Vasilis Syrgkanis -
2018 Oral: Semiparametric Contextual Bandits »
Akshay Krishnamurthy · Steven Wu · Vasilis Syrgkanis -
2018 Poster: Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising »
Borja de Balle Pigem · Yu-Xiang Wang -
2018 Oral: Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising »
Borja de Balle Pigem · Yu-Xiang Wang -
2017 Poster: Contextual Decision Processes with low Bellman rank are PAC-Learnable »
Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2017 Poster: Spectral Learning from a Single Trajectory under Finite-State Policies »
Borja de Balle Pigem · Odalric Maillard -
2017 Talk: Contextual Decision Processes with low Bellman rank are PAC-Learnable »
Nan Jiang · Akshay Krishnamurthy · Alekh Agarwal · John Langford · Robert Schapire -
2017 Talk: Spectral Learning from a Single Trajectory under Finite-State Policies »
Borja de Balle Pigem · Odalric Maillard -
2017 Poster: Active Learning for Cost-Sensitive Classification »
Akshay Krishnamurthy · Alekh Agarwal · Tzu-Kuo Huang · Hal Daumé III · John Langford -
2017 Talk: Active Learning for Cost-Sensitive Classification »
Akshay Krishnamurthy · Alekh Agarwal · Tzu-Kuo Huang · Hal Daumé III · John Langford