We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data. While prompting has emerged as a promising paradigm for few-shot and zero-shot learning, it is often brittle and requires much larger models than the standard supervised setup. We find that co-training makes it possible to improve the original prompt model while simultaneously learning a smaller, downstream task-specific model. When we have only partial access to a prompt model (e.g., output probabilities from GPT-3 (Brown et al., 2020)), we learn a calibration model over the prompt outputs. When we have full access to the prompt model's gradients but full finetuning remains prohibitively expensive (e.g., T0 (Sanh et al., 2021)), we learn a set of continuous soft-prompt vectors to iteratively update the prompt model. We find that models trained in this manner can significantly improve performance on challenging datasets where there is currently a large gap between prompt-based learning and fully supervised models.
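The classic co-training loop the abstract builds on can be sketched in a few lines. The sketch below is illustrative only: it uses two synthetic feature "views" and a toy nearest-centroid classifier rather than the paper's actual pairing of a large prompt model with a smaller task-specific model, and all names and data are assumptions.

```python
# Minimal co-training sketch (Blum & Mitchell, 1998). Illustrative assumptions:
# two feature views of the same examples and a toy nearest-centroid classifier,
# NOT the paper's prompt-model / task-model setup.
import numpy as np

def fit_centroids(X, y):
    # Nearest-centroid "model": one mean vector per class (binary labels 0/1).
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    # Hard labels plus a confidence score (negative distance to nearest centroid).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1), -d.min(axis=1)

def co_train(X_a, X_b, y_seed, seed_idx, rounds=3, k=4):
    """Co-training over two feature views X_a, X_b of the same n examples.

    Each round, each view's model pseudo-labels its k most confident
    still-unlabeled points; those pseudo-labels join a shared pool that
    both views train on in later rounds.
    """
    pool = {int(i): int(l) for i, l in zip(seed_idx, y_seed)}  # idx -> label
    n = len(X_a)
    for _ in range(rounds):
        for X in (X_a, X_b):
            idx = np.fromiter(pool, dtype=int)
            y = np.array([pool[i] for i in idx])
            unlab = np.array([i for i in range(n) if i not in pool])
            if len(unlab) == 0:
                break
            cents = fit_centroids(X[idx], y)
            yhat, conf = predict(cents, X[unlab])
            for o in np.argsort(-conf)[:k]:  # most confident first
                pool[int(unlab[o])] = int(yhat[o])
    # Final downstream model: trained on the full (pseudo-)labeled pool.
    idx = np.fromiter(pool, dtype=int)
    return fit_centroids(X_a[idx], np.array([pool[i] for i in idx]))
```

In the paper's setting, one "view" would be the large prompt model's outputs (or a calibration model over them) and the other the smaller task-specific model trained on the resulting pseudo-labels.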
Author Information
Hunter Lang (Massachusetts Institute of Technology)
Monica Agrawal (Massachusetts Institute of Technology)
Yoon Kim (Harvard University)
David Sontag (Massachusetts Institute of Technology)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Co-training Improves Prompt-based Learning for Large Language Models (Tue, Jul 19 through Wed, Jul 20, Hall E #124)
More from the Same Authors
- 2022: Evaluating Robustness to Dataset Shift via Parametric Robustness Sets (Michael Oberst · Nikolaj Thams · David Sontag)
- 2022: Evaluating Robustness to Dataset Shift via Parametric Robustness Sets (Nikolaj Thams · Michael Oberst · David Sontag)
- 2022 Poster: Sample Efficient Learning of Predictors that Complement Humans (Mohammad-Amin Charusaie · Hussein Mozannar · David Sontag · Samira Samadi)
- 2022 Spotlight: Sample Efficient Learning of Predictors that Complement Humans (Mohammad-Amin Charusaie · Hussein Mozannar · David Sontag · Samira Samadi)
- 2021 Poster: Neural Pharmacodynamic State Space Modeling (Zeshan Hussain · Rahul G. Krishnan · David Sontag)
- 2021 Poster: Regularizing towards Causal Invariance: Linear Models with Proxies (Michael Oberst · Nikolaj Thams · Jonas Peters · David Sontag)
- 2021 Poster: Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch) (Hunter Lang · David Sontag · Aravindan Vijayaraghavan)
- 2021 Spotlight: Regularizing towards Causal Invariance: Linear Models with Proxies (Michael Oberst · Nikolaj Thams · Jonas Peters · David Sontag)
- 2021 Oral: Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch) (Hunter Lang · David Sontag · Aravindan Vijayaraghavan)
- 2021 Spotlight: Neural Pharmacodynamic State Space Modeling (Zeshan Hussain · Rahul G. Krishnan · David Sontag)
- 2020 Poster: Emergence of Separable Manifolds in Deep Language Representations (Jonathan Mamou · Hang Le · Miguel A del Rio Fernandez · Cory Stephenson · Hanlin Tang · Yoon Kim · SueYeon Chung)
- 2020 Poster: Estimation of Bounds on Potential Outcomes For Decision Making (Maggie Makar · Fredrik Johansson · John Guttag · David Sontag)
- 2020 Poster: Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models (Rares-Darius Buhai · Yoni Halpern · Yoon Kim · Andrej Risteski · David Sontag)
- 2020 Poster: Consistent Estimators for Learning to Defer to an Expert (Hussein Mozannar · David Sontag)
- 2019 Poster: Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models (Michael Oberst · David Sontag)
- 2019 Oral: Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models (Michael Oberst · David Sontag)
- 2018 Poster: Semi-Amortized Variational Autoencoders (Yoon Kim · Sam Wiseman · Andrew Miller · David Sontag · Alexander Rush)
- 2018 Poster: Adversarially Regularized Autoencoders (Jake Zhao · Yoon Kim · Kelly Zhang · Alexander Rush · Yann LeCun)
- 2018 Oral: Semi-Amortized Variational Autoencoders (Yoon Kim · Sam Wiseman · Andrew Miller · David Sontag · Alexander Rush)
- 2018 Oral: Adversarially Regularized Autoencoders (Jake Zhao · Yoon Kim · Kelly Zhang · Alexander Rush · Yann LeCun)
- 2017 Poster: Estimating individual treatment effect: generalization bounds and algorithms (Uri Shalit · Fredrik D Johansson · David Sontag)
- 2017 Talk: Estimating individual treatment effect: generalization bounds and algorithms (Uri Shalit · Fredrik D Johansson · David Sontag)
- 2017 Poster: Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation (Yacine Jernite · Anna Choromanska · David Sontag)
- 2017 Talk: Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation (Yacine Jernite · Anna Choromanska · David Sontag)