Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. For example, we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule of MI between the decomposed views. The resulting expression is a sum of unconditional and conditional MI terms, each measuring a modest chunk of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and that it learns better representations in a vision domain and for dialogue generation.
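The decomposition described in the abstract can be sketched for the simplest two-level case. The notation here is an assumption made for illustration, not taken from this page: x and y are the two views, x' is a subview of x (a deterministic function of x), and K is the number of samples (one positive plus K - 1 negatives) used by a contrastive InfoNCE-style bound.

```latex
% Chain rule of mutual information. Because x' is a function of x,
% I(x, x'; y) = I(x; y), so the total MI splits into two smaller terms:
\begin{align}
  I(x; y) &= I(x'; y) + I(x; y \mid x').
\end{align}

% A standard contrastive (InfoNCE) lower bound with K samples cannot
% exceed log K, which causes underestimation when I(x; y) is large:
\begin{align}
  I_{\mathrm{NCE}}(x; y) \;\le\; \log K .
\end{align}

% Bounding each term of the decomposition separately raises the largest
% estimable value from log K to 2 log K in this two-level sketch:
\begin{align}
  I_{\mathrm{NCE}}(x'; y) + I_{\mathrm{NCE}}(x; y \mid x') \;\le\; 2 \log K .
\end{align}
```

Splitting x into more than two progressively more informed subviews extends the same argument, with one conditional term added per extra subview.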
Author Information
Alessandro Sordoni (Microsoft Research)
Nouha Dziri (University of Alberta)
I’m a PhD student at the University of Alberta where I investigate generative deep learning models and natural language processing methods. In particular, my research focuses on developing data-driven approaches for computational natural language understanding, primarily in the context of enabling machines to converse with humans in natural language. Further, I’m interested in exploring different methods for the fiendishly difficult problem of evaluating conversational AI.
Hannes Schulz (Microsoft)
Geoff Gordon (Carnegie Mellon University)
Philip Bachman (Microsoft Research)
Remi Tachet des Combes (Microsoft Research Montreal)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Decomposed Mutual Information Estimation for Contrastive Representation Learning
  Fri. Jul 23rd 04:00 -- 06:00 AM Room Virtual
More from the Same Authors
- 2023: Video-Guided Skill Discovery
  Manan Tomar · Dibya Ghosh · Vivek Myers · Anca Dragan · Matthew Taylor · Philip Bachman · Sergey Levine
- 2021 Poster: Understanding and Mitigating Accuracy Disparity in Regression
  Jianfeng Chi · Yuan Tian · Geoff Gordon · Han Zhao
- 2021 Spotlight: Understanding and Mitigating Accuracy Disparity in Regression
  Jianfeng Chi · Yuan Tian · Geoff Gordon · Han Zhao
- 2021 Poster: Information Obfuscation of Graph Neural Networks
  Peiyuan Liao · Han Zhao · Keyulu Xu · Tommi Jaakkola · Geoff Gordon · Stefanie Jegelka · Ruslan Salakhutdinov
- 2021 Spotlight: Information Obfuscation of Graph Neural Networks
  Peiyuan Liao · Han Zhao · Keyulu Xu · Tommi Jaakkola · Geoff Gordon · Stefanie Jegelka · Ruslan Salakhutdinov
- 2019: Poster discussion
  Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shorato Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · zhanglin peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari
- 2019: Convergence Properties of Neural Networks on Separable Data
  Remi Tachet des Combes
- 2019 Poster: Safe Policy Improvement with Baseline Bootstrapping
  Romain Laroche · Paul TRICHELAIR · Remi Tachet des Combes
- 2019 Poster: On Learning Invariant Representations for Domain Adaptation
  Han Zhao · Remi Tachet des Combes · Kun Zhang · Geoff Gordon
- 2019 Oral: On Learning Invariant Representations for Domain Adaptation
  Han Zhao · Remi Tachet des Combes · Kun Zhang · Geoff Gordon
- 2019 Oral: Safe Policy Improvement with Baseline Bootstrapping
  Romain Laroche · Paul TRICHELAIR · Remi Tachet des Combes
- 2018 Poster: Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data
  Amjad Almahairi · Sai Rajeswar · Alessandro Sordoni · Philip Bachman · Aaron Courville
- 2018 Oral: Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data
  Amjad Almahairi · Sai Rajeswar · Alessandro Sordoni · Philip Bachman · Aaron Courville
- 2018 Poster: Recurrent Predictive State Policy Networks
  Ahmed Hefny · Zita Marinho · Wen Sun · Siddhartha Srinivasa · Geoff Gordon
- 2018 Oral: Recurrent Predictive State Policy Networks
  Ahmed Hefny · Zita Marinho · Wen Sun · Siddhartha Srinivasa · Geoff Gordon
- 2017 Poster: Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
  Wen Sun · Arun Venkatraman · Geoff Gordon · Byron Boots · Drew Bagnell
- 2017 Talk: Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
  Wen Sun · Arun Venkatraman · Geoff Gordon · Byron Boots · Drew Bagnell
- 2017 Poster: Learning Algorithms for Active Learning
  Philip Bachman · Alessandro Sordoni · Adam Trischler
- 2017 Talk: Learning Algorithms for Active Learning
  Philip Bachman · Alessandro Sordoni · Adam Trischler