Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant. Challenges emerge when the training data stream is non-stationary, as in continual learning. One powerful approach to this challenge is to pre-train large encoders on volumes of readily available data and then tune them for a specific task. Given a new task, however, updating the weights of these encoders is challenging: a large number of weights must be fine-tuned and, as a result, the encoders forget information about previous tasks. In the present work, we propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes. Our paradigm is to encode, process the representation via a discrete bottleneck, and decode. The input is fed to the pre-trained encoder, the output of the encoder is used to select the nearest keys, and the corresponding values are fed to the decoder to solve the current task. The model can only fetch and re-use a sparse set of these key-value pairs during inference, enabling localized and context-dependent model updates. We theoretically investigate the ability of the discrete key-value bottleneck to minimize the effect of learning under distribution shifts and show that it reduces the complexity of the hypothesis class. We empirically verify the proposed method under challenging class-incremental learning scenarios and show that the proposed model, without any task boundaries, reduces catastrophic forgetting across a wide variety of pre-trained models, outperforming relevant baselines on this task.
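The encode, bottleneck, decode pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The random-projection encoder, the fixed random keys, the mean-pooling of selected values, the argmax decoder, the squared-error update, and all names (`KeyValueBottleneck`, `encode`, `decode`, `top_k`, `lr`) are assumptions chosen for illustration. It only aims to show how nearest-key selection makes an update touch a sparse subset of the stored values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen pre-trained encoder: a fixed random projection.
W_enc = rng.normal(size=(8, 4))


def encode(x):
    return np.tanh(x @ W_enc)


def decode(v):
    # Assumed trivial decoder: treat the fetched value vector as class scores.
    return int(np.argmax(v))


class KeyValueBottleneck:
    """Toy discrete key-value bottleneck (illustrative assumption, not the paper's code)."""

    def __init__(self, num_pairs, key_dim, value_dim, rng):
        # Keys stay fixed in this sketch; only the values are updated.
        self.keys = rng.normal(size=(num_pairs, key_dim))
        self.values = np.zeros((num_pairs, value_dim))

    def select(self, z, top_k=1):
        # Indices of the top_k keys closest to the encoding z (Euclidean distance).
        dists = np.linalg.norm(self.keys - z, axis=1)
        return np.argsort(dists)[:top_k]

    def forward(self, z, top_k=1):
        # Fetch the selected values and average them.
        idx = self.select(z, top_k)
        return self.values[idx].mean(axis=0), idx

    def update(self, z, target, lr=0.5, top_k=1):
        # One gradient step on 0.5 * ||out - target||^2.
        # Only the rows of `values` that were selected are modified.
        out, idx = self.forward(z, top_k)
        grad = (out - target) / len(idx)
        self.values[idx] -= lr * grad
        return idx


bottleneck = KeyValueBottleneck(num_pairs=16, key_dim=4, value_dim=3, rng=rng)
x = rng.normal(size=8)
z = encode(x)

before = bottleneck.values.copy()
touched = bottleneck.update(z, target=np.array([0.0, 1.0, 0.0]), top_k=2)
changed = np.flatnonzero(np.any(bottleneck.values != before, axis=1))
prediction = decode(bottleneck.forward(z, top_k=2)[0])
```

In this sketch, `changed` contains exactly the indices in `touched`: the remaining key-value pairs are left as they were, which is the localized, input-dependent update behavior the abstract refers to.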
Author Information
Frederik Träuble (MPI for Intelligent Systems)
Anirudh Goyal (Université de Montréal)
Nasim Rahaman (Max Planck Institute for Intelligent Systems)
Michael Mozer (Google Research)
Kenji Kawaguchi (NUS)
Yoshua Bengio (Mila - Quebec AI Institute)
Bernhard Schölkopf (MPI for Intelligent Systems Tübingen, Germany)
Bernhard Schölkopf received degrees in mathematics (London) and physics (Tübingen), and a doctorate in computer science from the Technical University Berlin. He has researched at AT&T Bell Labs, at GMD FIRST, Berlin, at the Australian National University, Canberra, and at Microsoft Research Cambridge (UK). In 2001, he was appointed scientific member of the Max Planck Society and director at the MPI for Biological Cybernetics; in 2010 he founded the Max Planck Institute for Intelligent Systems. For further information, see www.kyb.tuebingen.mpg.de/~bs.
More from the Same Authors
2021 : On the Fairness of Causal Algorithmic Recourse »
Julius von Kügelgen · Amir-Hossein Karimi · Umang Bhatt · Isabel Valera · Adrian Weller · Bernhard Schölkopf -
2021 : Algorithmic Recourse in Partially and Fully Confounded Settings Through Bounding Counterfactual Effects »
Julius von Kügelgen · Nikita Agarwal · Jakob Zeitler · Afsaneh Mastouri · Bernhard Schölkopf -
2021 : Gradient Starvation: A Learning Proclivity in Neural Networks »
Mohammad Pezeshki · Sékou-Oumar Kaba · Yoshua Bengio · Aaron Courville · Doina Precup · Guillaume Lajoie -
2021 : Epoch-Wise Double Descent: A Theory of Multi-scale Feature Learning Dynamics »
Mohammad Pezeshki · Amartya Mitra · Yoshua Bengio · Guillaume Lajoie -
2021 : Representation Learning for Out-of-distribution Generalization in Downstream Tasks »
Frederik Träuble · Andrea Dittadi · Manuel Wuthrich · Felix Widmaier · Peter V Gehler · Ole Winther · Francesco Locatello · Olivier Bachem · Bernhard Schölkopf · Stefan Bauer -
2021 : Exploration-Driven Representation Learning in Reinforcement Learning »
Akram Erraqabi · Mingde Zhao · Marlos C. Machado · Yoshua Bengio · Sainbayar Sukhbaatar · Ludovic Denoyer · Alessandro Lazaric -
2021 : Variational Causal Networks: Approximate Bayesian Inference over Causal Structures »
Yashas Annadani · Jonas Rothfuss · Alexandre Lacoste · Nino Scherrer · Anirudh Goyal · Yoshua Bengio · Stefan Bauer -
2021 : Lie interventions in complex systems with cycles »
Michel Besserve · Bernhard Schölkopf -
2022 : Learning to induce causal structure »
Rosemary Nan Ke · Silvia Chiappa · Jane Wang · Jorg Bornschein · Anirudh Goyal · Melanie Rey · Matthew Botvinick · Theophane Weber · Michael Mozer · Danilo J. Rezende -
2022 : On the Generalization and Adaption Performance of Causal Models »
Nino Scherrer · Anirudh Goyal · Stefan Bauer · Yoshua Bengio · Rosemary Nan Ke -
2022 : MAgNet: Mesh Agnostic Neural PDE Solver »
Oussama Boussif · Yoshua Bengio · Loubna Benabbou · Dan Assouline -
2022 : Maximum Mean Discrepancy Distributionally Robust Nonlinear Chance-Constrained Optimization with Finite-Sample Guarantee »
Yassine Nemmour · Heiner Kremer · Bernhard Schölkopf · Jia-Jie Zhu -
2023 : Spuriosity Didn’t Kill the Classifier: Using Invariant Predictions to Harness Spurious Features »
Cian Eastwood · Shashank Singh · Andrei Nicolicioiu · Marin Vlastelica · Julius von Kügelgen · Bernhard Schölkopf -
2023 : Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks »
Yuzhen Mao · Zhun Deng · Huaxiu Yao · Ting Ye · Kenji Kawaguchi · James Zou -
2023 : Leveraging sparse and shared feature activations for disentangled representation learning »
Marco Fumero · Florian Wenzel · Luca Zancato · Alessandro Achille · Emanuele Rodola · Stefano Soatto · Bernhard Schölkopf · Francesco Locatello -
2023 : Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding »
Alizée Pace · Hugo Yèche · Bernhard Schölkopf · Gunnar Ratsch · Guy Tennenholtz -
2023 : Learning Linear Causal Representations from Interventions under General Nonlinear Mixing »
Simon Buchholz · Goutham Rajendran · Elan Rosenfeld · Bryon Aragam · Bernhard Schölkopf · Pradeep Ravikumar -
2023 : Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport »
Alexander Tong · Nikolay Malkin · Guillaume Huguet · Yanlei Zhang · Jarrid Rector-Brooks · Kilian Fatras · Guy Wolf · Yoshua Bengio -
2023 : Simulation-Free Schrödinger Bridges via Score and Flow Matching »
Alexander Tong · Nikolay Malkin · Kilian Fatras · Lazar Atanackovic · Yanlei Zhang · Guillaume Huguet · Guy Wolf · Yoshua Bengio -
2023 : OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning »
Rim Assouel · Pau Rodriguez · Perouz Taslakian · David Vazquez · Yoshua Bengio -
2023 : Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation »
Chris Emezue · Alexandre Drouin · Tristan Deleu · Stefan Bauer · Yoshua Bengio -
2023 : Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network »
Tristan Deleu · Mizu Nishikawa-Toomey · Jithendaraa Subramanian · Nikolay Malkin · Laurent Charlin · Yoshua Bengio -
2023 : Flow Matching for Scalable Simulation-Based Inference »
Jonas Wildberger · Maximilian Dax · Simon Buchholz · Stephen R. Green · Jakob Macke · Bernhard Schölkopf -
2023 : BatchGFN: Generative Flow Networks for Batch Active Learning »
Shreshth Malik · Salem Lahlou · Andrew Jesson · Moksh Jain · Nikolay Malkin · Tristan Deleu · Yoshua Bengio · Yarin Gal -
2023 : Thompson Sampling for Improved Exploration in GFlowNets »
Jarrid Rector-Brooks · Kanika Madan · Moksh Jain · Maksym Korablyov · Chenghao Liu · Sarath Chandar · Nikolay Malkin · Yoshua Bengio -
2023 : GFlowNets for Causal Discovery: an Overview »
Dragos Cristian Manta · Edward Hu · Yoshua Bengio -
2023 : Constant Memory Attention Block »
Leo Feng · Frederick Tung · Hossein Hajimirsadeghi · Yoshua Bengio · Mohamed Osama Ahmed -
2023 : What if We Enrich day-ahead Solar Irradiance Time Series Forecasting with Spatio-Temporal Context? »
Oussama Boussif · Ghait Boukachab · Dan Assouline · Stefano Massaroli · Tianle Yuan · Loubna Benabbou · Yoshua Bengio -
2023 : Desiderata for Representation Learning from Identifiability, Disentanglement, and Group-Structuredness »
Hamza Keurti · Patrik Reizinger · Bernhard Schölkopf · Wieland Brendel -
2023 Workshop: Structured Probabilistic Inference and Generative Modeling »
Dinghuai Zhang · Yuanqi Du · Chenlin Meng · Shawn Tan · Yingzhen Li · Max Welling · Yoshua Bengio -
2023 : Opening Remark »
Dinghuai Zhang · Yuanqi Du · Chenlin Meng · Shawn Tan · Yingzhen Li · Max Welling · Yoshua Bengio -
2023 Oral: Hyena Hierarchy: Towards Larger Convolutional Language Models »
Michael Poli · Stefano Massaroli · Eric Nguyen · Daniel Y Fu · Tri Dao · Stephen Baccus · Yoshua Bengio · Stefano Ermon · Christopher Re -
2023 Poster: Provably Learning Object-Centric Representations »
Jack Brady · Roland S. Zimmermann · Yash Sharma · Bernhard Schölkopf · Julius von Kügelgen · Wieland Brendel -
2023 Poster: Equivariance with Learned Canonicalization Functions »
Sékou-Oumar Kaba · Arnab Kumar Mondal · Yan Zhang · Yoshua Bengio · Siamak Ravanbakhsh -
2023 Poster: GFlowOut: Dropout with Generative Flow Networks »
Dianbo Liu · Moksh Jain · Bonaventure F. P. Dossou · Qianli Shen · Salem Lahlou · Anirudh Goyal · Nikolay Malkin · Chris Emezue · Dinghuai Zhang · Nadhir Hassen · Xu Ji · Kenji Kawaguchi · Yoshua Bengio -
2023 Poster: On the Identifiability and Estimation of Causal Location-Scale Noise Models »
Alexander Immer · Christoph Schultheiss · Julia Vogt · Bernhard Schölkopf · Peter Bühlmann · Alexander Marx -
2023 Poster: Can Neural Network Memorization Be Localized? »
Pratyush Maini · Michael Mozer · Hanie Sedghi · Zachary Lipton · Zico Kolter · Chiyuan Zhang -
2023 Poster: On Data Manifolds Entailed by Structural Causal Models »
Ricardo Dominguez-Olmedo · Amir-Hossein Karimi · Georgios Arvanitidis · Bernhard Schölkopf -
2023 Poster: The Hessian perspective into the Nature of Convolutional Neural Networks »
Sidak Pal Singh · Thomas Hofmann · Bernhard Schölkopf -
2023 Poster: Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels »
Alexander Immer · Tycho van der Ouderaa · Mark van der Wilk · Gunnar Ratsch · Bernhard Schölkopf -
2023 Poster: On the Relationship Between Explanation and Prediction: A Causal View »
Amir-Hossein Karimi · Krikamol Muandet · Simon Kornblith · Bernhard Schölkopf · Been Kim -
2023 Poster: Diffusion Based Representation Learning »
Sarthak Mittal · Korbinian Abstreiter · Stefan Bauer · Bernhard Schölkopf · Arash Mehrjou -
2023 Poster: Hyena Hierarchy: Towards Larger Convolutional Language Models »
Michael Poli · Stefano Massaroli · Eric Nguyen · Daniel Y Fu · Tri Dao · Stephen Baccus · Yoshua Bengio · Stefano Ermon · Christopher Re -
2023 Poster: Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning »
Sébastien Lachapelle · Tristan Deleu · Divyat Mahajan · Ioannis Mitliagkas · Yoshua Bengio · Simon Lacoste-Julien · Quentin Bertrand -
2023 Poster: Better Training of GFlowNets with Local Credit and Incomplete Trajectories »
Ling Pan · Nikolay Malkin · Dinghuai Zhang · Yoshua Bengio -
2023 Poster: Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation »
Jeffrey Willette · Seanie Lee · Bruno Andreis · Kenji Kawaguchi · Juho Lee · Sung Ju Hwang -
2023 Poster: How Does Information Bottleneck Help Deep Learning? »
Kenji Kawaguchi · Zhun Deng · Xu Ji · Jiaoyang Huang -
2023 Poster: Learning GFlowNets From Partial Episodes For Improved Convergence And Stability »
Kanika Madan · Jarrid Rector-Brooks · Maksym Korablyov · Emmanuel Bengio · Moksh Jain · Andrei-Cristian Nica · Tom Bosc · Yoshua Bengio · Nikolay Malkin -
2023 Oral: Interventional Causal Representation Learning »
Kartik Ahuja · Divyat Mahajan · Yixin Wang · Yoshua Bengio -
2023 Oral: Provably Learning Object-Centric Representations »
Jack Brady · Roland S. Zimmermann · Yash Sharma · Bernhard Schölkopf · Julius von Kügelgen · Wieland Brendel -
2023 Oral: Learning GFlowNets From Partial Episodes For Improved Convergence And Stability »
Kanika Madan · Jarrid Rector-Brooks · Maksym Korablyov · Emmanuel Bengio · Moksh Jain · Andrei-Cristian Nica · Tom Bosc · Yoshua Bengio · Nikolay Malkin -
2023 Poster: FAENet: Frame Averaging Equivariant GNN for Materials Modeling »
Alexandre Duval · Victor Schmidt · Alex Hernandez-Garcia · Santiago Miret · Fragkiskos Malliaros · Yoshua Bengio · David Rolnick -
2023 Poster: Multi-Objective GFlowNets »
Moksh Jain · Sharath Chandra Raparthy · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Yoshua Bengio · Santiago Miret · Emmanuel Bengio -
2023 Poster: Estimation Beyond Data Reweighting: Kernel Method of Moments »
Heiner Kremer · Yassine Nemmour · Bernhard Schölkopf · Jia-Jie Zhu -
2023 Poster: Auxiliary Learning as an Asymmetric Bargaining Game »
Aviv Shamsian · Aviv Navon · Neta Glazer · Kenji Kawaguchi · Gal Chechik · Ethan Fetaya -
2023 Poster: Test-time Adaptation with Slot-Centric Models »
Mihir Prabhudesai · Anirudh Goyal · Sujoy Paul · Sjoerd van Steenkiste · Mehdi S. M. Sajjadi · Gaurav Aggarwal · Thomas Kipf · Deepak Pathak · Katerina Fragkiadaki -
2023 Poster: Interventional Causal Representation Learning »
Kartik Ahuja · Divyat Mahajan · Yixin Wang · Yoshua Bengio -
2023 Poster: Homomorphism AutoEncoder --- Learning Group Structured Representations from Observed Transitions »
Hamza Keurti · Hsiao-Ru Pan · Michel Besserve · Benjamin F. Grewe · Bernhard Schölkopf -
2023 Poster: A theory of continuous generative flow networks »
Salem Lahlou · Tristan Deleu · Pablo Lemos · Dinghuai Zhang · Alexandra Volokhova · Alex Hernandez-Garcia · Lena Nehale Ezzine · Yoshua Bengio · Nikolay Malkin -
2023 Poster: GFlowNet-EM for Learning Compositional Latent Variable Models »
Edward Hu · Nikolay Malkin · Moksh Jain · Katie Everett · Alexandros Graikos · Yoshua Bengio -
2022 Workshop: Hardware-aware efficient training (HAET) »
Gonçalo Mordido · Yoshua Bengio · Ghouthi Boukli Hacene · Vincent Gripon · François Leduc-Primeau · Vahid Partovi Nia · Julie Grollier -
2022 : Is a Modular Architecture Enough? »
Sarthak Mittal · Yoshua Bengio · Guillaume Lajoie -
2022 : Invited talks I, Q/A »
Bernhard Schölkopf · David Lopez-Paz -
2022 : Invited Talks 1, Bernhard Schölkopf and David Lopez-Paz »
Bernhard Schölkopf · David Lopez-Paz -
2022 Poster: Building Robust Ensembles via Margin Boosting »
Dinghuai Zhang · Hongyang Zhang · Aaron Courville · Yoshua Bengio · Pradeep Ravikumar · Arun Sai Suggala -
2022 Poster: Retrieval-Augmented Reinforcement Learning »
Anirudh Goyal · Abe Friesen · Andrea Banino · Theophane Weber · Nan Rosemary Ke · Adrià Puigdomenech Badia · Arthur Guez · Mehdi Mirza · Peter Humphreys · Ksenia Konyushkova · Michal Valko · Simon Osindero · Timothy Lillicrap · Nicolas Heess · Charles Blundell -
2022 Poster: Multi-scale Feature Learning Dynamics: Insights for Double Descent »
Mohammad Pezeshki · Amartya Mitra · Yoshua Bengio · Guillaume Lajoie -
2022 Spotlight: Retrieval-Augmented Reinforcement Learning »
Anirudh Goyal · Abe Friesen · Andrea Banino · Theophane Weber · Nan Rosemary Ke · Adrià Puigdomenech Badia · Arthur Guez · Mehdi Mirza · Peter Humphreys · Ksenia Konyushkova · Michal Valko · Simon Osindero · Timothy Lillicrap · Nicolas Heess · Charles Blundell -
2022 Spotlight: Building Robust Ensembles via Margin Boosting »
Dinghuai Zhang · Hongyang Zhang · Aaron Courville · Yoshua Bengio · Pradeep Ravikumar · Arun Sai Suggala -
2022 Spotlight: Multi-scale Feature Learning Dynamics: Insights for Double Descent »
Mohammad Pezeshki · Amartya Mitra · Yoshua Bengio · Guillaume Lajoie -
2022 Poster: When and How Mixup Improves Calibration »
Linjun Zhang · Zhun Deng · Kenji Kawaguchi · James Zou -
2022 Poster: Action-Sufficient State Representation Learning for Control with Structural Constraints »
Biwei Huang · Chaochao Lu · Liu Leqi · Jose Miguel Hernandez-Lobato · Clark Glymour · Bernhard Schölkopf · Kun Zhang -
2022 Poster: Biological Sequence Design with GFlowNets »
Moksh Jain · Emmanuel Bengio · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Bonaventure Dossou · Chanakya Ekbote · Jie Fu · Tianyu Zhang · Michael Kilgour · Dinghuai Zhang · Lena Simine · Payel Das · Yoshua Bengio -
2022 Poster: Generalization and Robustness Implications in Object-Centric Learning »
Andrea Dittadi · Samuele Papa · Michele De Vita · Bernhard Schölkopf · Ole Winther · Francesco Locatello -
2022 Spotlight: Action-Sufficient State Representation Learning for Control with Structural Constraints »
Biwei Huang · Chaochao Lu · Liu Leqi · Jose Miguel Hernandez-Lobato · Clark Glymour · Bernhard Schölkopf · Kun Zhang -
2022 Spotlight: Generalization and Robustness Implications in Object-Centric Learning »
Andrea Dittadi · Samuele Papa · Michele De Vita · Bernhard Schölkopf · Ole Winther · Francesco Locatello -
2022 Spotlight: Biological Sequence Design with GFlowNets »
Moksh Jain · Emmanuel Bengio · Alex Hernandez-Garcia · Jarrid Rector-Brooks · Bonaventure Dossou · Chanakya Ekbote · Jie Fu · Tianyu Zhang · Michael Kilgour · Dinghuai Zhang · Lena Simine · Payel Das · Yoshua Bengio -
2022 Spotlight: When and How Mixup Improves Calibration »
Linjun Zhang · Zhun Deng · Kenji Kawaguchi · James Zou -
2022 Poster: Robustness Implies Generalization via Data-Dependent Generalization Bounds »
Kenji Kawaguchi · Zhun Deng · Kyle Luh · Jiaoyang Huang -
2022 Poster: Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning »
Utku Evci · Vincent Dumoulin · Hugo Larochelle · Michael Mozer -
2022 Poster: Multi-Task Learning as a Bargaining Game »
Aviv Navon · Aviv Shamsian · Idan Achituve · Haggai Maron · Kenji Kawaguchi · Gal Chechik · Ethan Fetaya -
2022 Poster: Generative Flow Networks for Discrete Probabilistic Modeling »
Dinghuai Zhang · Nikolay Malkin · Zhen Liu · Alexandra Volokhova · Aaron Courville · Yoshua Bengio -
2022 Poster: Causal Inference Through the Structural Causal Marginal Problem »
Luigi Gresele · Julius von Kügelgen · Jonas Kübler · Elke Kirschbaum · Bernhard Schölkopf · Dominik Janzing -
2022 Poster: Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions »
Heiner Kremer · Jia-Jie Zhu · Krikamol Muandet · Bernhard Schölkopf -
2022 Poster: On the Adversarial Robustness of Causal Algorithmic Recourse »
Ricardo Dominguez-Olmedo · Amir-Hossein Karimi · Bernhard Schölkopf -
2022 Poster: Towards Scaling Difference Target Propagation by Learning Backprop Targets »
Maxence Ernoult · Fabrice Normandin · Abhinav Moudgil · Sean Spinney · Eugene Belilovsky · Irina Rish · Blake Richards · Yoshua Bengio -
2022 Spotlight: Towards Scaling Difference Target Propagation by Learning Backprop Targets »
Maxence Ernoult · Fabrice Normandin · Abhinav Moudgil · Sean Spinney · Eugene Belilovsky · Irina Rish · Blake Richards · Yoshua Bengio -
2022 Spotlight: Functional Generalized Empirical Likelihood Estimation for Conditional Moment Restrictions »
Heiner Kremer · Jia-Jie Zhu · Krikamol Muandet · Bernhard Schölkopf -
2022 Spotlight: Causal Inference Through the Structural Causal Marginal Problem »
Luigi Gresele · Julius von Kügelgen · Jonas Kübler · Elke Kirschbaum · Bernhard Schölkopf · Dominik Janzing -
2022 Spotlight: Generative Flow Networks for Discrete Probabilistic Modeling »
Dinghuai Zhang · Nikolay Malkin · Zhen Liu · Alexandra Volokhova · Aaron Courville · Yoshua Bengio -
2022 Spotlight: On the Adversarial Robustness of Causal Algorithmic Recourse »
Ricardo Dominguez-Olmedo · Amir-Hossein Karimi · Bernhard Schölkopf -
2022 Oral: Robustness Implies Generalization via Data-Dependent Generalization Bounds »
Kenji Kawaguchi · Zhun Deng · Kyle Luh · Jiaoyang Huang -
2022 Oral: Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning »
Utku Evci · Vincent Dumoulin · Hugo Larochelle · Michael Mozer -
2022 Spotlight: Multi-Task Learning as a Bargaining Game »
Aviv Navon · Aviv Shamsian · Idan Achituve · Haggai Maron · Kenji Kawaguchi · Gal Chechik · Ethan Fetaya -
2021 Workshop: Tackling Climate Change with Machine Learning »
Hari Prasanna Das · Katarzyna Tokarska · Maria João Sousa · Meareg Hailemariam · David Rolnick · Xiaoxiang Zhu · Yoshua Bengio -
2021 Poster: Function Contrastive Learning of Transferable Meta-Representations »
Muhammad Waleed Gondal · Shruti Joshi · Nasim Rahaman · Stefan Bauer · Manuel Wuthrich · Bernhard Schölkopf -
2021 Spotlight: Function Contrastive Learning of Transferable Meta-Representations »
Muhammad Waleed Gondal · Shruti Joshi · Nasim Rahaman · Stefan Bauer · Manuel Wuthrich · Bernhard Schölkopf -
2021 Poster: On Disentangled Representations Learned from Correlated Data »
Frederik Träuble · Elliot Creager · Niki Kilbertus · Francesco Locatello · Andrea Dittadi · Anirudh Goyal · Bernhard Schölkopf · Stefan Bauer -
2021 Poster: Bayesian Quadrature on Riemannian Data Manifolds »
Christian Fröhlich · Alexandra Gessner · Philipp Hennig · Bernhard Schölkopf · Georgios Arvanitidis -
2021 Poster: Robust Representation Learning via Perceptual Similarity Metrics »
Saeid A Taghanaki · Kristy Choi · Amir Hosein Khasahmadi · Anirudh Goyal -
2021 Spotlight: Bayesian Quadrature on Riemannian Data Manifolds »
Christian Fröhlich · Alexandra Gessner · Philipp Hennig · Bernhard Schölkopf · Georgios Arvanitidis -
2021 Oral: On Disentangled Representations Learned from Correlated Data »
Frederik Träuble · Elliot Creager · Niki Kilbertus · Francesco Locatello · Andrea Dittadi · Anirudh Goyal · Bernhard Schölkopf · Stefan Bauer -
2021 Spotlight: Robust Representation Learning via Perceptual Similarity Metrics »
Saeid A Taghanaki · Kristy Choi · Amir Hosein Khasahmadi · Anirudh Goyal -
2021 Poster: An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming »
Minkai Xu · Wujie Wang · Shitong Luo · Chence Shi · Yoshua Bengio · Rafael Gomez-Bombarelli · Jian Tang -
2021 Spotlight: An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming »
Minkai Xu · Wujie Wang · Shitong Luo · Chence Shi · Yoshua Bengio · Rafael Gomez-Bombarelli · Jian Tang -
2021 Poster: Necessary and sufficient conditions for causal feature selection in time series with latent common causes »
Atalanti Mastakouri · Bernhard Schölkopf · Dominik Janzing -
2021 Poster: Conditional Distributional Treatment Effect with Kernel Conditional Mean Embeddings and U-Statistic Regression »
Junhyung Park · Uri Shalit · Bernhard Schölkopf · Krikamol Muandet -
2021 Spotlight: Necessary and sufficient conditions for causal feature selection in time series with latent common causes »
Atalanti Mastakouri · Bernhard Schölkopf · Dominik Janzing -
2021 Spotlight: Conditional Distributional Treatment Effect with Kernel Conditional Mean Embeddings and U-Statistic Regression »
Junhyung Park · Uri Shalit · Bernhard Schölkopf · Krikamol Muandet -
2021 Poster: Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth »
Keyulu Xu · Mozhi Zhang · Stefanie Jegelka · Kenji Kawaguchi -
2021 Poster: Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning »
Sumedh Sontakke · Arash Mehrjou · Laurent Itti · Bernhard Schölkopf -
2021 Spotlight: Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth »
Keyulu Xu · Mozhi Zhang · Stefanie Jegelka · Kenji Kawaguchi -
2021 Spotlight: Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning »
Sumedh Sontakke · Arash Mehrjou · Laurent Itti · Bernhard Schölkopf -
2020 : QA for invited talk 4 Bengio »
Yoshua Bengio -
2020 : Invited talk 4 Bengio »
Yoshua Bengio -
2020 : Keynote: Yoshua Bengio (Q&A) »
Yoshua Bengio -
2020 : Keynote: Yoshua Bengio »
Yoshua Bengio -
2020 Workshop: Inductive Biases, Invariances and Generalization in Reinforcement Learning »
Anirudh Goyal · Rosemary Nan Ke · Jane Wang · Stefan Bauer · Theophane Weber · Fabio Viola · Bernhard Schölkopf -
2020 Workshop: Object-Oriented Learning: Perception, Representation, and Reasoning »
Sungjin Ahn · Adam Kosiorek · Jessica Hamrick · Sjoerd van Steenkiste · Yoshua Bengio -
2020 Workshop: MLRetrospectives: A Venue for Self-Reflection in ML Research »
Jessica Forde · Jesse Dodge · Mayoore Jaiswal · Rosanne Liu · Ryan Lowe · Joelle Pineau · Yoshua Bengio -
2020 Poster: Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules »
Sarthak Mittal · Alex Lamb · Anirudh Goyal · Vikram Voleti · Murray Shanahan · Guillaume Lajoie · Michael Mozer · Yoshua Bengio -
2020 Poster: Learning to Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning »
Sai Krishna Gottipati · Boris Sattarov · Sufeng Niu · Yashaswi Pathak · Haoran Wei · Shengchao Liu · Simon Blackburn · Karam Thomas · Connor Coley · Jian Tang · Sarath Chandar · Yoshua Bengio -
2020 Poster: Perceptual Generative Autoencoders »
Zijun Zhang · Ruixiang Zhang · Zongpeng Li · Yoshua Bengio · Liam Paull -
2020 Poster: Revisiting Fundamentals of Experience Replay »
William Fedus · Prajit Ramachandran · Rishabh Agarwal · Yoshua Bengio · Hugo Larochelle · Mark Rowland · Will Dabney -
2020 Poster: Small-GAN: Speeding up GAN Training using Core-Sets »
Samrath Sinha · Han Zhang · Anirudh Goyal · Yoshua Bengio · Hugo Larochelle · Augustus Odena -
2020 Poster: Weakly-Supervised Disentanglement Without Compromises »
Francesco Locatello · Ben Poole · Gunnar Ratsch · Bernhard Schölkopf · Olivier Bachem · Michael Tschannen -
2019 : AI Commons »
Yoshua Bengio -
2019 : Opening remarks »
Yoshua Bengio -
2019 Workshop: AI For Social Good (AISG) »
Margaux Luck · Kris Sankaran · Tristan Sylvain · Sean McGregor · Jonnie Penn · Girmaw Abebe Tadesse · Virgile Sylvain · Myriam Côté · Lester Mackey · Rayid Ghani · Yoshua Bengio -
2019 : Panel Discussion »
Yoshua Bengio · Andrew Ng · Raia Hadsell · John Platt · Claire Monteleoni · Jennifer Chayes -
2019 : Poster discussion »
Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shorato Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · zhanglin peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari -
2019 : Personalized Visualization of the Impact of Climate Change »
Yoshua Bengio -
2019 : Networking Lunch (provided) + Poster Session »
Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki -
2019 Workshop: Climate Change: How Can AI Help? »
David Rolnick · Alexandre Lacoste · Tegan Maharaj · Jennifer Chayes · Yoshua Bengio -
2019 Poster: State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations »
Alex Lamb · Jonathan Binas · Anirudh Goyal · Sandeep Subramanian · Ioannis Mitliagkas · Yoshua Bengio · Michael Mozer -
2019 Poster: Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness »
Raphael Suter · Djordje Miladinovic · Bernhard Schölkopf · Stefan Bauer -
2019 Poster: On the Spectral Bias of Neural Networks »
Nasim Rahaman · Aristide Baratin · Devansh Arpit · Felix Draxler · Min Lin · Fred Hamprecht · Yoshua Bengio · Aaron Courville -
2019 Oral: Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness »
Raphael Suter · Djordje Miladinovic · Bernhard Schölkopf · Stefan Bauer -
2019 Oral: On the Spectral Bias of Neural Networks »
Nasim Rahaman · Aristide Baratin · Devansh Arpit · Felix Draxler · Min Lin · Fred Hamprecht · Yoshua Bengio · Aaron Courville -
2019 Oral: State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations »
Alex Lamb · Jonathan Binas · Anirudh Goyal · Sandeep Subramanian · Ioannis Mitliagkas · Yoshua Bengio · Michael Mozer -
2019 Poster: Kernel Mean Matching for Content Addressability of GANs »
Wittawat Jitkrittum · Patsorn Sangkloy · Muhammad Waleed Gondal · Amit Raj · James Hays · Bernhard Schölkopf -
2019 Oral: Kernel Mean Matching for Content Addressability of GANs »
Wittawat Jitkrittum · Patsorn Sangkloy · Muhammad Waleed Gondal · Amit Raj · James Hays · Bernhard Schölkopf -
2019 Poster: Manifold Mixup: Better Representations by Interpolating Hidden States »
Vikas Verma · Alex Lamb · Christopher Beckham · Amir Najafi · Ioannis Mitliagkas · David Lopez-Paz · Yoshua Bengio -
2019 Poster: First-Order Adversarial Vulnerability of Neural Networks and Input Dimension »
Carl-Johann Simon-Gabriel · Yann Ollivier · Leon Bottou · Bernhard Schölkopf · David Lopez-Paz -
2019 Poster: Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations »
Francesco Locatello · Stefan Bauer · Mario Lucic · Gunnar Ratsch · Sylvain Gelly · Bernhard Schölkopf · Olivier Bachem -
2019 Poster: GMNN: Graph Markov Neural Networks »
Meng Qu · Yoshua Bengio · Jian Tang -
2019 Oral: GMNN: Graph Markov Neural Networks »
Meng Qu · Yoshua Bengio · Jian Tang -
2019 Oral: Manifold Mixup: Better Representations by Interpolating Hidden States »
Vikas Verma · Alex Lamb · Christopher Beckham · Amir Najafi · Ioannis Mitliagkas · David Lopez-Paz · Yoshua Bengio -
2019 Oral: First-Order Adversarial Vulnerability of Neural Networks and Input Dimension »
Carl-Johann Simon-Gabriel · Yann Ollivier · Leon Bottou · Bernhard Schölkopf · David Lopez-Paz -
2019 Oral: Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations »
Francesco Locatello · Stefan Bauer · Mario Lucic · Gunnar Ratsch · Sylvain Gelly · Bernhard Schölkopf · Olivier Bachem -
2018 Poster: Detecting non-causal artifacts in multivariate linear regression models »
Dominik Janzing · Bernhard Schölkopf -
2018 Poster: On Matching Pursuit and Coordinate Descent »
Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi -
2018 Poster: Mutual Information Neural Estimation »
Mohamed Belghazi · Aristide Baratin · Sai Rajeswar · Sherjil Ozair · Yoshua Bengio · R Devon Hjelm · Aaron Courville -
2018 Oral: Detecting non-causal artifacts in multivariate linear regression models »
Dominik Janzing · Bernhard Schölkopf -
2018 Oral: On Matching Pursuit and Coordinate Descent »
Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi -
2018 Oral: Mutual Information Neural Estimation »
Mohamed Belghazi · Aristide Baratin · Sai Rajeswar · Sherjil Ozair · Yoshua Bengio · R Devon Hjelm · Aaron Courville -
2018 Poster: Tempered Adversarial Networks »
Mehdi S. M. Sajjadi · Giambattista Parascandolo · Arash Mehrjou · Bernhard Schölkopf -
2018 Poster: Differentially Private Database Release via Kernel Mean Embeddings »
Matej Balog · Ilya Tolstikhin · Bernhard Schölkopf -
2018 Poster: Focused Hierarchical RNNs for Conditional Sequence Processing »
Rosemary Nan Ke · Konrad Zolna · Alessandro Sordoni · Zhouhan Lin · Adam Trischler · Yoshua Bengio · Joelle Pineau · Laurent Charlin · Christopher Pal -
2018 Oral: Differentially Private Database Release via Kernel Mean Embeddings »
Matej Balog · Ilya Tolstikhin · Bernhard Schölkopf -
2018 Oral: Tempered Adversarial Networks »
Mehdi S. M. Sajjadi · Giambattista Parascandolo · Arash Mehrjou · Bernhard Schölkopf -
2018 Oral: Focused Hierarchical RNNs for Conditional Sequence Processing »
Rosemary Nan Ke · Konrad Zolna · Alessandro Sordoni · Zhouhan Lin · Adam Trischler · Yoshua Bengio · Joelle Pineau · Laurent Charlin · Christopher Pal -
2018 Poster: Learning Independent Causal Mechanisms »
Giambattista Parascandolo · Niki Kilbertus · Mateo Rojas-Carulla · Bernhard Schölkopf -
2018 Oral: Learning Independent Causal Mechanisms »
Giambattista Parascandolo · Niki Kilbertus · Mateo Rojas-Carulla · Bernhard Schölkopf -
2017 Workshop: Reproducibility in Machine Learning Research »
Rosemary Nan Ke · Anirudh Goyal · Alex Lamb · Joelle Pineau · Samy Bengio · Yoshua Bengio -
2017 Poster: Sharp Minima Can Generalize For Deep Nets »
Laurent Dinh · Razvan Pascanu · Samy Bengio · Yoshua Bengio -
2017 Poster: A Closer Look at Memorization in Deep Networks »
David Krueger · Yoshua Bengio · Stanislaw Jastrzebski · Maxinder S. Kanwal · Nicolas Ballas · Asja Fischer · Emmanuel Bengio · Devansh Arpit · Tegan Maharaj · Aaron Courville · Simon Lacoste-Julien -
2017 Talk: A Closer Look at Memorization in Deep Networks »
David Krueger · Yoshua Bengio · Stanislaw Jastrzebski · Maxinder S. Kanwal · Nicolas Ballas · Asja Fischer · Emmanuel Bengio · Devansh Arpit · Tegan Maharaj · Aaron Courville · Simon Lacoste-Julien -
2017 Talk: Sharp Minima Can Generalize For Deep Nets »
Laurent Dinh · Razvan Pascanu · Samy Bengio · Yoshua Bengio -
2017 Invited Talk: Causal Learning »
Bernhard Schölkopf