Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains surprisingly unexplored. In this paper, we first undertake a systematic empirical investigation of this combination, finding (i) that in domain adaptation settings, self-training and contrastive learning offer significant complementary gains; and (ii) that in semi-supervised learning settings, surprisingly, the benefits are not synergistic. Across eight distribution shift datasets (e.g., BREEDs, WILDS), we demonstrate that the combined method obtains 3–8% higher accuracy than either approach independently. Finally, we theoretically analyze these techniques in a simplified model of distribution shift, demonstrating scenarios under which the features produced by contrastive learning can yield a good initialization for self-training to further amplify gains and achieve optimal performance, even when either method alone would fail.
Author Information
Saurabh Garg (Carnegie Mellon University)
Amrith Setlur (Carnegie Mellon University)
Zachary Lipton (CMU & Abridge)
Sivaraman Balakrishnan (Carnegie Mellon University)
Virginia Smith (Carnegie Mellon University)

Virginia Smith is an assistant professor in the Machine Learning Department at Carnegie Mellon University, and a courtesy faculty member in the Electrical and Computer Engineering Department. Her research interests span machine learning, optimization, and distributed systems. Prior to CMU, Virginia was a postdoc at Stanford University, received a Ph.D. in Computer Science from UC Berkeley, and obtained undergraduate degrees in Mathematics and Computer Science from the University of Virginia.
Aditi Raghunathan (Carnegie Mellon University)
More from the Same Authors
-
2021 : Private Multi-Task Learning: Formulation and Applications to Federated Learning »
Shengyuan Hu · Steven Wu · Virginia Smith -
2021 : Do You See What I See? A Comparison of Radiologist Eye Gaze to Computer Vision Saliency Maps for Chest X-ray Classification »
Jesse Kim · Helen Zhou · Zachary Lipton -
2022 : Domain Adaptation under Open Set Label Shift »
Saurabh Garg · Sivaraman Balakrishnan · Zachary Lipton -
2022 : Unsupervised Learning under Latent Label Shift »
Pranav Mani · Manley Roberts · Saurabh Garg · Zachary Lipton -
2022 : Characterizing Datapoints via Second-Split Forgetting »
Pratyush Maini · Saurabh Garg · Zachary Lipton · Zico Kolter -
2022 : Counterfactual Metrics for Auditing Black-Box Recommender Systems for Ethical Concerns »
Nil-Jana Akpinar · Liu Leqi · Dylan Hadfield-Menell · Zachary Lipton -
2022 : RiskyZoo: A Library for Risk-Sensitive Supervised Learning »
William Wong · Audrey Huang · Liu Leqi · Kamyar Azizzadenesheli · Zachary Lipton -
2023 : Model-tuning Via Prompts Makes NLP Models Adversarially Robust »
Mrigank Raman · Pratyush Maini · Zico Kolter · Zachary Lipton · Danish Pruthi -
2023 : (Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy »
Elan Rosenfeld · Saurabh Garg -
2023 : Why is SAM Robust to Label Noise? »
Christina Baek · Zico Kolter · Aditi Raghunathan -
2023 : Sharpness-Aware Minimization Enhances Feature Diversity »
Jacob Mitchell Springer · Vaishnavh Nagarajan · Aditi Raghunathan -
2023 : Deep Neural Networks Extrapolate Cautiously (Most of the Time) »
Katie Kang · Amrith Setlur · Claire Tomlin · Sergey Levine -
2023 : Deep Equilibrium Based Neural Operators for Steady-State PDEs »
Tanya Marwah · Ashwini Pokle · Zico Kolter · Zachary Lipton · Jianfeng Lu · Andrej Risteski -
2023 : How to Cope with Gradual Data Drift? »
Rasool Fakoor · Jonas Mueller · Zachary Lipton · Pratik Chaudhari · Alex Smola -
2023 : TMARS: Improving Visual Representations by Circumventing Text Feature Learning »
Pratyush Maini · Sachin Goyal · Zachary Lipton · Zico Kolter · Aditi Raghunathan -
2023 : Identifying Inequity in Treatment Allocation »
Yewon Byun · Dylan Sam · Zachary Lipton · Bryan Wilder -
2023 : Progressive Knowledge Distillation: Balancing Inference Latency and Accuracy at Runtime »
Don Kurian Dennis · Abhishek Shetty · Anish Sevekari · Kazuhito Koishida · Virginia Smith -
2023 : Conditional Diffusion Replay for Continual Learning in Medical Settings »
Yewon Byun · Saurabh Garg · Sanket Vaibhav Mehta · Praveer Singh · Jayashree Kalpathy-Cramer · Bryan Wilder · Zachary Lipton -
2023 : SCIS 2023 Panel, The Future of Generalization: Scale, Safety and Beyond »
Maggie Makar · Samuel Bowman · Zachary Lipton · Adam Gleave -
2023 : Prompt-based Generative Replay: A Text-to-Image Approach for Continual Learning in Medical Settings »
Yewon Byun · Saurabh Garg · Sanket Vaibhav Mehta · Jayashree Kalpathy-Cramer · Praveer Singh · Bryan Wilder · Zachary Lipton -
2023 : Aditi Raghunathan »
Aditi Raghunathan -
2023 Poster: Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective »
Tanya Marwah · Zachary Lipton · Jianfeng Lu · Andrej Risteski -
2023 Poster: Can Neural Network Memorization Be Localized? »
Pratyush Maini · Michael Mozer · Hanie Sedghi · Zachary Lipton · Zico Kolter · Chiyuan Zhang -
2023 Poster: Contextual Reliability: When Different Features Matter in Different Contexts »
Gaurav Ghosal · Amrith Setlur · Daniel S Brown · Anca Dragan · Aditi Raghunathan -
2023 Poster: RLSbench: Domain Adaptation Under Relaxed Label Shift »
Saurabh Garg · Nick Erickson · James Sharpnack · Alex Smola · Sivaraman Balakrishnan · Zachary Lipton -
2023 Poster: Automatically Auditing Large Language Models via Discrete Optimization »
Erik Jones · Anca Dragan · Aditi Raghunathan · Jacob Steinhardt -
2023 Poster: CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets »
Zachary Novack · Julian McAuley · Zachary Lipton · Saurabh Garg -
2022 Workshop: Principles of Distribution Shift (PODS) »
Elan Rosenfeld · Saurabh Garg · Shibani Santurkar · Jamie Morgenstern · Hossein Mobahi · Zachary Lipton · Andrej Risteski -
2022 Poster: Supervised Learning with General Risk Functionals »
Liu Leqi · Audrey Huang · Zachary Lipton · Kamyar Azizzadenesheli -
2022 Poster: Private Adaptive Optimization with Side information »
Tian Li · Manzil Zaheer · Sashank Jakkam Reddi · Virginia Smith -
2022 Spotlight: Private Adaptive Optimization with Side information »
Tian Li · Manzil Zaheer · Sashank Jakkam Reddi · Virginia Smith -
2022 Spotlight: Supervised Learning with General Risk Functionals »
Liu Leqi · Audrey Huang · Zachary Lipton · Kamyar Azizzadenesheli -
2021 : RL Explainability & Interpretability Panel »
Ofra Amir · Finale Doshi-Velez · Alan Fern · Zachary Lipton · Omer Gottesman · Niranjani Prasad -
2021 Poster: Correcting Exposure Bias for Link Recommendation »
Shantanu Gupta · Hao Wang · Zachary Lipton · Yuyang Wang -
2021 Spotlight: Correcting Exposure Bias for Link Recommendation »
Shantanu Gupta · Hao Wang · Zachary Lipton · Yuyang Wang -
2021 Poster: RATT: Leveraging Unlabeled Data to Guarantee Generalization »
Saurabh Garg · Sivaraman Balakrishnan · Zico Kolter · Zachary Lipton -
2021 Oral: RATT: Leveraging Unlabeled Data to Guarantee Generalization »
Saurabh Garg · Sivaraman Balakrishnan · Zico Kolter · Zachary Lipton -
2021 Poster: On Proximal Policy Optimization's Heavy-tailed Gradients »
Saurabh Garg · Joshua Zhanson · Emilio Parisotto · Adarsh Prasad · Zico Kolter · Zachary Lipton · Sivaraman Balakrishnan · Ruslan Salakhutdinov · Pradeep Ravikumar -
2021 Poster: Heterogeneity for the Win: One-Shot Federated Clustering »
Don Kurian Dennis · Tian Li · Virginia Smith -
2021 Poster: Ditto: Fair and Robust Federated Learning Through Personalization »
Tian Li · Shengyuan Hu · Ahmad Beirami · Virginia Smith -
2021 Spotlight: Ditto: Fair and Robust Federated Learning Through Personalization »
Tian Li · Shengyuan Hu · Ahmad Beirami · Virginia Smith -
2021 Spotlight: On Proximal Policy Optimization's Heavy-tailed Gradients »
Saurabh Garg · Joshua Zhanson · Emilio Parisotto · Adarsh Prasad · Zico Kolter · Zachary Lipton · Sivaraman Balakrishnan · Ruslan Salakhutdinov · Pradeep Ravikumar -
2021 Spotlight: Heterogeneity for the Win: One-Shot Federated Clustering »
Don Kurian Dennis · Tian Li · Virginia Smith -
2020 : Contributed Talk 3: A Unified View of Label Shift Estimation »
Saurabh Garg -
2020 Poster: Uncertainty-Aware Lookahead Factor Models for Quantitative Investing »
Lakshay Chauhan · John Alberg · Zachary Lipton -
2019 Poster: Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment »
Yifan Wu · Ezra Winston · Divyansh Kaushik · Zachary Lipton -
2019 Poster: What is the Effect of Importance Weighting in Deep Learning? »
Jonathon Byrd · Zachary Lipton -
2019 Oral: Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment »
Yifan Wu · Ezra Winston · Divyansh Kaushik · Zachary Lipton -
2019 Oral: What is the Effect of Importance Weighting in Deep Learning? »
Jonathon Byrd · Zachary Lipton -
2019 Poster: A Kernel Theory of Modern Data Augmentation »
Tri Dao · Albert Gu · Alexander J Ratner · Virginia Smith · Christopher De Sa · Christopher Re -
2019 Oral: A Kernel Theory of Modern Data Augmentation »
Tri Dao · Albert Gu · Alexander J Ratner · Virginia Smith · Christopher De Sa · Christopher Re -
2018 Poster: Detecting and Correcting for Label Shift with Black Box Predictors »
Zachary Lipton · Yu-Xiang Wang · Alexander Smola -
2018 Poster: Born Again Neural Networks »
Tommaso Furlanello · Zachary Lipton · Michael Tschannen · Laurent Itti · Anima Anandkumar -
2018 Oral: Born Again Neural Networks »
Tommaso Furlanello · Zachary Lipton · Michael Tschannen · Laurent Itti · Anima Anandkumar -
2018 Oral: Detecting and Correcting for Label Shift with Black Box Predictors »
Zachary Lipton · Yu-Xiang Wang · Alexander Smola -
2018 Poster: Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information »
Yichong Xu · Hariank Muthakana · Sivaraman Balakrishnan · Aarti Singh · Artur Dubrawski -
2018 Oral: Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information »
Yichong Xu · Hariank Muthakana · Sivaraman Balakrishnan · Aarti Singh · Artur Dubrawski