We study multi-objective reinforcement learning (RL) where an agent's reward is represented as a vector. In settings where an agent competes against opponents, its performance is measured by the distance of its average return vector from a target set. We develop statistically and computationally efficient algorithms for approaching the associated target set. Our results extend Blackwell's approachability theorem (Blackwell, 1956) to tabular RL, where strategic exploration becomes essential. The algorithms presented are adaptive; their guarantees hold even without Blackwell's approachability condition. If the opponents use fixed policies, we give an improved rate of approaching the target set while also tackling the more ambitious goal of simultaneously minimizing a scalar cost function. We discuss our analysis for this special case by relating our results to prior work on constrained RL. To our knowledge, this work provides the first provably efficient algorithms for vector-valued Markov games, and our theoretical guarantees are near-optimal.
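For intuition about the quantity being controlled, here is a minimal, hypothetical Python sketch (not the paper's algorithm): it measures the Euclidean distance between a running average return vector and a convex target set, using an axis-aligned box as an assumed stand-in for the target set so the projection has a closed form. Approachability asks that this distance go to zero as the number of episodes grows.

```python
import numpy as np

def dist_to_box(avg_return, lower, upper):
    """Euclidean distance from a vector to the axis-aligned box [lower, upper]^d.

    The box is a hypothetical example of a convex target set; the closest point
    is obtained by clipping each coordinate into the box.
    """
    projection = np.clip(avg_return, lower, upper)   # nearest point of the box
    return np.linalg.norm(avg_return - projection)

# Toy simulation: running average of 2-dimensional episode returns.
rng = np.random.default_rng(0)
returns = rng.normal(loc=[0.6, 0.4], scale=0.1, size=(1000, 2))  # simulated vector returns
running_avg = returns.cumsum(axis=0) / np.arange(1, 1001)[:, None]
print(dist_to_box(running_avg[-1], lower=0.5, upper=1.0))        # distance after 1000 episodes
```

In the paper's setting the returns are generated by the agent's and opponents' policies in a Markov game rather than drawn i.i.d.; the sketch only illustrates the distance-to-target-set objective itself.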
Author Information
Tiancheng Yu (MIT)
I am a second-year PhD student in MIT EECS, advised by Prof. Suvrit Sra. Before MIT, I was an undergraduate at Tsinghua University, majoring in Electronic Engineering with a minor in Statistics.
Yi Tian (Massachusetts Institute of Technology)
Jingzhao Zhang (MIT)
Suvrit Sra (MIT)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Oral: Provably Efficient Algorithms for Multi-Objective Competitive RL
  Thu. Jul 22nd 01:00 -- 01:20 AM Room
More from the Same Authors
- 2021: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
  Chi Jin · Qinghua Liu · Tiancheng Yu
- 2023 Poster: Global optimality for Euclidean CCCP under Riemannian convexity
  Melanie Weber · Suvrit Sra
- 2023 Poster: On the Training Instability of Shuffling SGD with Batch Normalization
  David X. Wu · Chulhee Yun · Suvrit Sra
- 2022: Sign and Basis Invariant Networks for Spectral Graph Representation Learning
  Derek Lim · Joshua Robinson · Lingxiao Zhao · Tess Smidt · Suvrit Sra · Haggai Maron · Stefanie Jegelka
- 2022 Poster: Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
  Jingzhao Zhang · Hongzhou Lin · Subhro Das · Suvrit Sra · Ali Jadbabaie
- 2022 Poster: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
  Chi Jin · Qinghua Liu · Tiancheng Yu
- 2022 Poster: Near-Optimal Learning of Extensive-Form Games with Imperfect Information
  Yu Bai · Chi Jin · Song Mei · Tiancheng Yu
- 2022 Spotlight: Near-Optimal Learning of Extensive-Form Games with Imperfect Information
  Yu Bai · Chi Jin · Song Mei · Tiancheng Yu
- 2022 Spotlight: Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
  Jingzhao Zhang · Hongzhou Lin · Subhro Das · Suvrit Sra · Ali Jadbabaie
- 2022 Spotlight: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces
  Chi Jin · Qinghua Liu · Tiancheng Yu
- 2022 Poster: Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
  Jingzhao Zhang · Haochuan Li · Suvrit Sra · Ali Jadbabaie
- 2022 Poster: Understanding the unstable convergence of gradient descent
  Kwangjun Ahn · Jingzhao Zhang · Suvrit Sra
- 2022 Spotlight: Understanding the unstable convergence of gradient descent
  Kwangjun Ahn · Jingzhao Zhang · Suvrit Sra
- 2022 Spotlight: Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
  Jingzhao Zhang · Haochuan Li · Suvrit Sra · Ali Jadbabaie
- 2021 Poster: A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
  Qinghua Liu · Tiancheng Yu · Yu Bai · Chi Jin
- 2021 Poster: Online Learning in Unknown Markov Games
  Yi Tian · Yuanhao Wang · Tiancheng Yu · Suvrit Sra
- 2021 Spotlight: A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
  Qinghua Liu · Tiancheng Yu · Yu Bai · Chi Jin
- 2021 Spotlight: Online Learning in Unknown Markov Games
  Yi Tian · Yuanhao Wang · Tiancheng Yu · Suvrit Sra
- 2021 Poster: Three Operator Splitting with a Nonconvex Loss Function
  Alp Yurtsever · Varun Mangalick · Suvrit Sra
- 2021 Spotlight: Three Operator Splitting with a Nonconvex Loss Function
  Alp Yurtsever · Varun Mangalick · Suvrit Sra
- 2020: Short Talk 5 - Near-Optimal Reinforcement Learning with Self-Play
  Tiancheng Yu
- 2020 Poster: Reward-Free Exploration for Reinforcement Learning
  Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu
- 2020 Poster: Strength from Weakness: Fast Learning Using Weak Supervision
  Joshua Robinson · Stefanie Jegelka · Suvrit Sra
- 2020 Poster: Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions
  Jingzhao Zhang · Hongzhou Lin · Stefanie Jegelka · Suvrit Sra · Ali Jadbabaie
- 2020 Poster: Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition
  Chi Jin · Tiancheng Jin · Haipeng Luo · Suvrit Sra · Tiancheng Yu
- 2019 Poster: Escaping Saddle Points with Adaptive Gradient Methods
  Matthew Staib · Sashank Jakkam Reddi · Satyen Kale · Sanjiv Kumar · Suvrit Sra
- 2019 Oral: Escaping Saddle Points with Adaptive Gradient Methods
  Matthew Staib · Sashank Jakkam Reddi · Satyen Kale · Sanjiv Kumar · Suvrit Sra
- 2019 Poster: Random Shuffling Beats SGD after Finite Epochs
  Jeff HaoChen · Suvrit Sra
- 2019 Oral: Random Shuffling Beats SGD after Finite Epochs
  Jeff HaoChen · Suvrit Sra
- 2019 Poster: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator
  Alp Yurtsever · Suvrit Sra · Volkan Cevher
- 2019 Oral: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator
  Alp Yurtsever · Suvrit Sra · Volkan Cevher