Learning shortcuts, such as relying on spurious correlations or memorizing specific examples, make robust machine learning difficult to achieve. Invariant learning methods such as GroupDRO, which learn from multiple training groups, are effective at producing more robust models. However, the high cost of annotating data with environment labels limits the practicality of these algorithms. This work introduces cross-risk minimization (CRM), a framework that automatically groups examples by their level of difficulty. As an extension of the widely used cross-validation routine, CRM uses the mistakes a model makes on held-out data as a signal to identify challenging examples. By leveraging these mistakes, CRM labels both training and validation examples into groups of differing difficulty. We provide experiments on the Waterbirds dataset, a well-known out-of-distribution (OOD) benchmark, to demonstrate the effectiveness of CRM in inferring reliable group labels. These inferred group labels can then be used by invariant learning methods to improve worst-group accuracy.
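The abstract's core idea, grouping examples by whether a model trained on other data misclassifies them, can be illustrated with a minimal sketch. This is not the authors' exact CRM algorithm; it is an assumed simplification using scikit-learn's standard cross-validation utilities on synthetic data standing in for Waterbirds-style features.

```python
# Hedged sketch of CRM-style group inference: reuse the cross-validation
# routine and flag held-out mistakes as the "hard" group. This is an
# illustrative approximation, not the paper's implementation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Toy classification data (placeholder for real features/labels).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Every example is predicted by a model that never saw it during
# training, mirroring the held-out evaluation in cross-validation.
held_out_preds = cross_val_predict(
    LogisticRegression(max_iter=1000), X, y, cv=5
)

# Inferred group label: 1 = "hard" (misclassified on its held-out
# fold), 0 = "easy". These labels could replace hand-annotated
# environment labels in a method such as GroupDRO.
groups = (held_out_preds != y).astype(int)
print("hard examples:", int(groups.sum()), "of", len(groups))
```

The inferred `groups` array is the signal the abstract describes: downstream invariant-learning methods would upweight or balance the hard group to improve worst-group accuracy.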
Author Information
Mohammad Pezeshki (Meta (FAIR))
Diane Bouchacourt (Meta)
Mark Ibrahim (Fundamental AI Research (FAIR), Meta AI)
Nicolas Ballas (Université de Montréal)
Pascal Vincent (University of Montreal)
David Lopez-Paz (Facebook AI Research)
More from the Same Authors
-
2022 : BARACK: Partially Supervised Group Robustness With Guarantees »
Nimit Sohoni · Maziar Sanjabi · Nicolas Ballas · Aditya Grover · Shaoliang Nie · Hamed Firooz · Christopher Re -
2023 : Understanding the Detrimental Class-level Effects of Data Augmentation »
Polina Kirichenko · Mark Ibrahim · Randall Balestriero · Diane Bouchacourt · Ramakrishna Vedantam · Hamed Firooz · Andrew Wilson -
2023 : Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection »
Vitória Barin-Pacela · Kartik Ahuja · Simon Lacoste-Julien · Pascal Vincent -
2023 : Does Progress On Object Recognition Benchmarks Improve Real-World Generalization? »
Megan Richards · Diane Bouchacourt · Mark Ibrahim · Polina Kirichenko -
2023 : A Closer Look at In-Context Learning under Distribution Shifts »
Kartik Ahuja · David Lopez-Paz -
2023 Poster: Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization »
Alexandre Rame · Kartik Ahuja · Jianyu Zhang · Matthieu Cord · Leon Bottou · David Lopez-Paz -
2023 Oral: Why does Throwing Away Data Improve Worst-Group Error? »
Kamalika Chaudhuri · Kartik Ahuja · Martin Arjovsky · David Lopez-Paz -
2023 Poster: Why does Throwing Away Data Improve Worst-Group Error? »
Kamalika Chaudhuri · Kartik Ahuja · Martin Arjovsky · David Lopez-Paz -
2022 : Invited talks I, Q/A »
Bernhard Schölkopf · David Lopez-Paz -
2022 : Invited Talks 1, Bernhard Schölkopf and David Lopez-Paz »
Bernhard Schölkopf · David Lopez-Paz -
2022 Poster: Rich Feature Construction for the Optimization-Generalization Dilemma »
Jianyu Zhang · David Lopez-Paz · Léon Bottou -
2022 Spotlight: Rich Feature Construction for the Optimization-Generalization Dilemma »
Jianyu Zhang · David Lopez-Paz · Léon Bottou -
2020 Workshop: Workshop on Continual Learning »
Haytham Fayek · Arslan Chaudhry · David Lopez-Paz · Eugene Belilovsky · Jonathan Richard Schwarz · Marc Pickett · Rahaf Aljundi · Sayna Ebrahimi · Razvan Pascanu · Puneet Dokania -
2020 Poster: Entropy Minimization In Emergent Languages »
Eugene Kharitonov · Rahma Chaabouni · Diane Bouchacourt · Marco Baroni -
2019 Poster: Manifold Mixup: Better Representations by Interpolating Hidden States »
Vikas Verma · Alex Lamb · Christopher Beckham · Amir Najafi · Ioannis Mitliagkas · David Lopez-Paz · Yoshua Bengio -
2019 Poster: First-Order Adversarial Vulnerability of Neural Networks and Input Dimension »
Carl-Johann Simon-Gabriel · Yann Ollivier · Leon Bottou · Bernhard Schölkopf · David Lopez-Paz -
2019 Oral: Manifold Mixup: Better Representations by Interpolating Hidden States »
Vikas Verma · Alex Lamb · Christopher Beckham · Amir Najafi · Ioannis Mitliagkas · David Lopez-Paz · Yoshua Bengio -
2019 Oral: First-Order Adversarial Vulnerability of Neural Networks and Input Dimension »
Carl-Johann Simon-Gabriel · Yann Ollivier · Leon Bottou · Bernhard Schölkopf · David Lopez-Paz -
2018 Poster: Optimizing the Latent Space of Generative Networks »
Piotr Bojanowski · Armand Joulin · David Lopez-Paz · Arthur Szlam -
2018 Oral: Optimizing the Latent Space of Generative Networks »
Piotr Bojanowski · Armand Joulin · David Lopez-Paz · Arthur Szlam -
2017 Poster: A Closer Look at Memorization in Deep Networks »
David Krueger · Yoshua Bengio · Stanislaw Jastrzebski · Maxinder S. Kanwal · Nicolas Ballas · Asja Fischer · Emmanuel Bengio · Devansh Arpit · Tegan Maharaj · Aaron Courville · Simon Lacoste-Julien -
2017 Talk: A Closer Look at Memorization in Deep Networks »
David Krueger · Yoshua Bengio · Stanislaw Jastrzebski · Maxinder S. Kanwal · Nicolas Ballas · Asja Fischer · Emmanuel Bengio · Devansh Arpit · Tegan Maharaj · Aaron Courville · Simon Lacoste-Julien