The score function estimator is widely used for estimating gradients of stochastic objectives in stochastic computation graphs (SCGs), e.g., in reinforcement learning and meta-learning. While deriving first-order gradient estimators by differentiating a surrogate loss (SL) objective is computationally and conceptually simple, using the same approach for higher-order derivatives is more challenging. Firstly, analytically deriving and implementing such estimators is laborious and not compliant with automatic differentiation. Secondly, repeatedly applying SL to construct new objectives for each order of derivative involves increasingly cumbersome graph manipulations. Lastly, to match the first-order gradient under differentiation, SL treats part of the cost as a fixed sample, which we show leads to missing and wrong terms in estimators of higher-order derivatives. To address all these shortcomings in a unified way, we introduce DiCE, which provides a single objective that can be differentiated repeatedly, generating correct estimators of derivatives of any order in SCGs. Unlike SL, DiCE relies on automatic differentiation to perform the requisite graph manipulations. We verify the correctness of DiCE both through a proof and through numerical evaluation of the DiCE derivative estimates. We also use DiCE to propose and evaluate a novel approach for multi-agent learning. Our code is available at https://goo.gl/xkkGxN.
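As a rough illustration of the first-order score function estimator that the abstract takes as its starting point, the sketch below estimates the gradient of E[f(x)] for a single Bernoulli(theta) variable and compares it with the closed-form value. It is not the authors' code and does not implement DiCE. The function names, the cost values, and the choice of a Bernoulli variable are all assumptions made for this example.

```python
import random


def score_function_gradient(theta, f, num_samples, rng):
    """Monte Carlo estimate of d/dtheta E_{x ~ Bernoulli(theta)}[f(x)].

    Uses the identity grad E[f(x)] = E[f(x) * grad log p(x; theta)].
    """
    total = 0.0
    for _ in range(num_samples):
        x = 1 if rng.random() < theta else 0
        # Score: d/dtheta log p(x; theta) for a Bernoulli variable.
        score = 1.0 / theta if x == 1 else -1.0 / (1.0 - theta)
        total += f(x) * score
    return total / num_samples


def exact_gradient(f):
    # E[f] = theta * f(1) + (1 - theta) * f(0), so the gradient is f(1) - f(0).
    return f(1) - f(0)


def cost(x):
    # Hypothetical cost used only for this illustration.
    return 3.0 if x == 1 else 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = score_function_gradient(0.3, cost, 200_000, rng)
exact = exact_gradient(cost)
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

The estimate should land close to the exact gradient, with the gap shrinking as the number of samples grows. Extending this by hand to second and higher derivatives is the step the abstract describes as laborious, which is the motivation given for a single repeatedly differentiable objective.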
Author Information
Jakob Foerster (Facebook AI Research)
Gregory Farquhar (University of Oxford)
Maruan Al-Shedivat (Carnegie Mellon University)
Tim Rocktäschel (University of Oxford)
Eric Xing (Petuum Inc. and CMU)
Shimon Whiteson (University of Oxford)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Poster: DiCE: The Infinitely Differentiable Monte Carlo Estimator »
Wed. Jul 11th 04:15 -- 07:00 PM Room Hall B #102
More from the Same Authors
-
2021 : Towards Principled Disentanglement for Domain Generalization »
Hanlin Zhang · Yi-Fan Zhang · Weiyang Liu · Adrian Weller · Bernhard Schölkopf · Eric Xing -
2023 : Counterfactual Generation with Identifiability Guarantees »
Hanqi Yan · Lingjing Kong · Lin Gui · Yuejie Chi · Eric Xing · Yulan He · Kun Zhang -
2023 : Identification of Nonlinear Latent Hierarchical Causal Models »
Lingjing Kong · Biwei Huang · Feng Xie · Eric Xing · Yuejie Chi · Kun Zhang -
2023 : Making Scalable Meta Learning Practical »
Sang Keun Choe · Sanket Vaibhav Mehta · Hwijeen Ahn · Willie Neiswanger · Pengtao Xie · Emma Strubell · Eric Xing -
2022 Workshop: The First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward »
Huaxiu Yao · Hugo Larochelle · Percy Liang · Colin Raffel · Jian Tang · Ying WEI · Saining Xie · Eric Xing · Chelsea Finn -
2022 Poster: SDQ: Stochastic Differentiable Quantization with Mixed Precision »
Xijie Huang · Zhiqiang Shen · Shichao Li · Zechun Liu · Hu Xianghong · Jeffry Wicaksana · Eric Xing · Kwang-Ting Cheng -
2022 Poster: Communicating via Markov Decision Processes »
Samuel Sokota · Christian Schroeder · Maximilian Igl · Luisa Zintgraf · Phil Torr · Martin Strohmeier · Zico Kolter · Shimon Whiteson · Jakob Foerster -
2022 Spotlight: SDQ: Stochastic Differentiable Quantization with Mixed Precision »
Xijie Huang · Zhiqiang Shen · Shichao Li · Zechun Liu · Hu Xianghong · Jeffry Wicaksana · Eric Xing · Kwang-Ting Cheng -
2022 Spotlight: Communicating via Markov Decision Processes »
Samuel Sokota · Christian Schroeder · Maximilian Igl · Luisa Zintgraf · Phil Torr · Martin Strohmeier · Zico Kolter · Shimon Whiteson · Jakob Foerster -
2022 Poster: Generalized Beliefs for Cooperative AI »
Darius Muglich · Luisa Zintgraf · Christian Schroeder de Witt · Shimon Whiteson · Jakob Foerster -
2022 Spotlight: Generalized Beliefs for Cooperative AI »
Darius Muglich · Luisa Zintgraf · Christian Schroeder de Witt · Shimon Whiteson · Jakob Foerster -
2021 Workshop: Self-Supervised Learning for Reasoning and Perception »
Pengtao Xie · Shanghang Zhang · Ishan Misra · Pulkit Agrawal · Katerina Fragkiadaki · Ruisi Zhang · Tassilo Klein · Asli Celikyilmaz · Mihaela van der Schaar · Eric Xing -
2021 : Invited Talk: Eric P. Xing. A Data-Centric View for Composable Natural Language Processing. »
Eric Xing -
2021 Workshop: Interpretable Machine Learning in Healthcare »
Yuyin Zhou · Xiaoxiao Li · Vicky Yao · Pengtao Xie · DOU QI · Nicha Dvornek · Julia Schnabel · Judy Wawira · Yifan Peng · Ronald Summers · Alan Karthikesalingam · Lei Xing · Eric Xing -
2021 Poster: Average-Reward Off-Policy Policy Evaluation with Function Approximation »
Shangtong Zhang · Yi Wan · Richard Sutton · Shimon Whiteson -
2021 Poster: Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning »
Luisa Zintgraf · Leo Feng · Cong Lu · Maximilian Igl · Kristian Hartikainen · Katja Hofmann · Shimon Whiteson -
2021 Spotlight: Average-Reward Off-Policy Policy Evaluation with Function Approximation »
Shangtong Zhang · Yi Wan · Richard Sutton · Shimon Whiteson -
2021 Spotlight: Breaking the Deadly Triad with a Target Network »
Shangtong Zhang · Hengshuai Yao · Shimon Whiteson -
2021 Spotlight: Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning »
Luisa Zintgraf · Leo Feng · Cong Lu · Maximilian Igl · Kristian Hartikainen · Katja Hofmann · Shimon Whiteson -
2021 Poster: Breaking the Deadly Triad with a Target Network »
Shangtong Zhang · Hengshuai Yao · Shimon Whiteson -
2021 Poster: Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning »
Shariq Iqbal · Christian Schroeder · Bei Peng · Wendelin Boehmer · Shimon Whiteson · Fei Sha -
2021 Oral: Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning »
Shariq Iqbal · Christian Schroeder · Bei Peng · Wendelin Boehmer · Shimon Whiteson · Fei Sha -
2021 Poster: Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviychuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2021 Poster: UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning »
Tarun Gupta · Anuj Mahajan · Bei Peng · Wendelin Boehmer · Shimon Whiteson -
2021 Spotlight: Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning »
Anuj Mahajan · Mikayel Samvelyan · Lei Mao · Viktor Makoviychuk · Animesh Garg · Jean Kossaifi · Shimon Whiteson · Yuke Zhu · Anima Anandkumar -
2021 Spotlight: UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning »
Tarun Gupta · Anuj Mahajan · Bei Peng · Wendelin Boehmer · Shimon Whiteson -
2020 Poster: Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation »
Shangtong Zhang · Bo Liu · Hengshuai Yao · Shimon Whiteson -
2020 Poster: Deep Coordination Graphs »
Wendelin Boehmer · Vitaly Kurin · Shimon Whiteson -
2020 Poster: GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values »
Shangtong Zhang · Bo Liu · Shimon Whiteson -
2019 Workshop: Adaptive and Multitask Learning: Algorithms & Systems »
Maruan Al-Shedivat · Anthony Platanios · Otilia Stretcu · Jacob Andreas · Ameet Talwalkar · Rich Caruana · Tom Mitchell · Eric Xing -
2019 Workshop: Learning and Reasoning with Graph-Structured Representations »
Ethan Fetaya · Zhiting Hu · Thomas Kipf · Yujia Li · Xiaodan Liang · Renjie Liao · Raquel Urtasun · Hao Wang · Max Welling · Eric Xing · Richard Zemel -
2019 Poster: Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Francis Song · Edward Hughes · Neil Burch · Iain Dunning · Shimon Whiteson · Matthew Botvinick · Michael Bowling -
2019 Oral: Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Francis Song · Edward Hughes · Neil Burch · Iain Dunning · Shimon Whiteson · Matthew Botvinick · Michael Bowling -
2019 Poster: A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs »
Jingkai Mao · Jakob Foerster · Tim Rocktäschel · Maruan Al-Shedivat · Gregory Farquhar · Shimon Whiteson -
2019 Poster: Theoretically Principled Trade-off between Robustness and Accuracy »
Hongyang Zhang · Yaodong Yu · Jiantao Jiao · Eric Xing · Laurent El Ghaoui · Michael Jordan -
2019 Poster: Fast Context Adaptation via Meta-Learning »
Luisa Zintgraf · Kyriacos Shiarlis · Vitaly Kurin · Katja Hofmann · Shimon Whiteson -
2019 Oral: A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs »
Jingkai Mao · Jakob Foerster · Tim Rocktäschel · Maruan Al-Shedivat · Gregory Farquhar · Shimon Whiteson -
2019 Oral: Fast Context Adaptation via Meta-Learning »
Luisa Zintgraf · Kyriacos Shiarlis · Vitaly Kurin · Katja Hofmann · Shimon Whiteson -
2019 Oral: Theoretically Principled Trade-off between Robustness and Accuracy »
Hongyang Zhang · Yaodong Yu · Jiantao Jiao · Eric Xing · Laurent El Ghaoui · Michael Jordan -
2019 Poster: Fingerprint Policy Optimisation for Robust Reinforcement Learning »
Supratik Paul · Michael A Osborne · Shimon Whiteson -
2019 Oral: Fingerprint Policy Optimisation for Robust Reinforcement Learning »
Supratik Paul · Michael A Osborne · Shimon Whiteson -
2018 Poster: Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis »
Pengtao Xie · Wei Wu · Yichen Zhu · Eric Xing -
2018 Poster: Transformation Autoregressive Networks »
Junier Oliva · Kumar Avinava Dubey · Manzil Zaheer · Barnabás Póczos · Ruslan Salakhutdinov · Eric Xing · Jeff Schneider -
2018 Poster: Fourier Policy Gradients »
Mattie Fellows · Kamil Ciosek · Shimon Whiteson -
2018 Oral: Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis »
Pengtao Xie · Wei Wu · Yichen Zhu · Eric Xing -
2018 Oral: Fourier Policy Gradients »
Mattie Fellows · Kamil Ciosek · Shimon Whiteson -
2018 Oral: Transformation Autoregressive Networks »
Junier Oliva · Kumar Avinava Dubey · Manzil Zaheer · Barnabás Póczos · Ruslan Salakhutdinov · Eric Xing · Jeff Schneider -
2018 Poster: The Mechanics of n-Player Differentiable Games »
David Balduzzi · Sebastien Racaniere · James Martens · Jakob Foerster · Karl Tuyls · Thore Graepel -
2018 Poster: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Mikayel Samvelyan · Christian Schroeder · Gregory Farquhar · Jakob Foerster · Shimon Whiteson -
2018 Poster: Learning Policy Representations in Multiagent Systems »
Aditya Grover · Maruan Al-Shedivat · Jayesh K. Gupta · Yura Burda · Harrison Edwards -
2018 Poster: Deep Variational Reinforcement Learning for POMDPs »
Maximilian Igl · Luisa Zintgraf · Tuan Anh Le · Frank Wood · Shimon Whiteson -
2018 Oral: Deep Variational Reinforcement Learning for POMDPs »
Maximilian Igl · Luisa Zintgraf · Tuan Anh Le · Frank Wood · Shimon Whiteson -
2018 Oral: The Mechanics of n-Player Differentiable Games »
David Balduzzi · Sebastien Racaniere · James Martens · Jakob Foerster · Karl Tuyls · Thore Graepel -
2018 Oral: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning »
Tabish Rashid · Mikayel Samvelyan · Christian Schroeder · Gregory Farquhar · Jakob Foerster · Shimon Whiteson -
2018 Oral: Learning Policy Representations in Multiagent Systems »
Aditya Grover · Maruan Al-Shedivat · Jayesh K. Gupta · Yura Burda · Harrison Edwards -
2018 Poster: Nonoverlap-Promoting Variable Selection »
Pengtao Xie · Hongbao Zhang · Yichen Zhu · Eric Xing -
2018 Poster: TACO: Learning Task Decomposition via Temporal Alignment for Control »
Kyriacos Shiarlis · Markus Wulfmeier · Sasha Salter · Shimon Whiteson · Ingmar Posner -
2018 Poster: Gated Path Planning Networks »
Lisa Lee · Emilio Parisotto · Devendra Singh Chaplot · Eric Xing · Ruslan Salakhutdinov -
2018 Oral: Gated Path Planning Networks »
Lisa Lee · Emilio Parisotto · Devendra Singh Chaplot · Eric Xing · Ruslan Salakhutdinov -
2018 Oral: TACO: Learning Task Decomposition via Temporal Alignment for Control »
Kyriacos Shiarlis · Markus Wulfmeier · Sasha Salter · Shimon Whiteson · Ingmar Posner -
2018 Oral: Nonoverlap-Promoting Variable Selection »
Pengtao Xie · Hongbao Zhang · Yichen Zhu · Eric Xing -
2017 Poster: Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Nantas Nardelli · Gregory Farquhar · Triantafyllos Afouras · Phil Torr · Pushmeet Kohli · Shimon Whiteson -
2017 Poster: Toward Controlled Generation of Text »
Zhiting Hu · Zichao Yang · Xiaodan Liang · Ruslan Salakhutdinov · Eric Xing -
2017 Talk: Toward Controlled Generation of Text »
Zhiting Hu · Zichao Yang · Xiaodan Liang · Ruslan Salakhutdinov · Eric Xing -
2017 Talk: Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning »
Jakob Foerster · Nantas Nardelli · Gregory Farquhar · Triantafyllos Afouras · Phil Torr · Pushmeet Kohli · Shimon Whiteson -
2017 Poster: Uncorrelation and Evenness: a New Diversity-Promoting Regularizer »
Pengtao Xie · Aarti Singh · Eric Xing -
2017 Poster: Learning Latent Space Models with Angular Constraints »
Pengtao Xie · Yuntian Deng · Yi Zhou · Abhimanu Kumar · Yaoliang Yu · James Zou · Eric Xing -
2017 Poster: Input Switched Affine Networks: An RNN Architecture Designed for Interpretability »
Jakob Foerster · Justin Gilmer · Jan Chorowski · Jascha Sohl-Dickstein · David Sussillo -
2017 Talk: Learning Latent Space Models with Angular Constraints »
Pengtao Xie · Yuntian Deng · Yi Zhou · Abhimanu Kumar · Yaoliang Yu · James Zou · Eric Xing -
2017 Talk: Uncorrelation and Evenness: a New Diversity-Promoting Regularizer »
Pengtao Xie · Aarti Singh · Eric Xing -
2017 Talk: Input Switched Affine Networks: An RNN Architecture Designed for Interpretability »
Jakob Foerster · Justin Gilmer · Jan Chorowski · Jascha Sohl-Dickstein · David Sussillo -
2017 Poster: Post-Inference Prior Swapping »
Willie Neiswanger · Eric Xing -
2017 Talk: Post-Inference Prior Swapping »
Willie Neiswanger · Eric Xing