

Oral in Workshop: Sampling and Optimization in Discrete Space

Understanding prompt engineering does not require rethinking generalization

Victor Akinwande · Yiding Jiang · Dylan Sam · Zico Kolter


Abstract:

Zero-shot learning in prompted vision-language models, the practice of crafting prompts to build classifiers without an explicit training process, shows impressive performance in many settings. A seemingly surprising fact also emerges: this method suffers relatively little from overfitting; i.e., when a prompt is manually engineered to achieve low error on a given training set (thus rendering the method no longer zero-shot), the approach still performs relatively well on held-out test data. In this paper, we show that we can explain such performance remarkably well via recourse to classical PAC-Bayes bounds. Specifically, we show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are \emph{remarkably} tight by the standards of the literature: for instance, the generalization bound of an ImageNet classifier is often within a few percentage points of the true test error. Indeed, we show that we can therefore \emph{greedily} search over the prompt space in such a framework, improving upon training performance while retaining the same bound. Furthermore, the bound is well suited to model selection: the models with the best bound typically also have the best test performance. This work thus provides a substantial justification for the widespread use of ``prompt engineering,'' even if it seems as though such methods could overfit a training set.
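
For intuition, here is a minimal sketch of the kind of bound at play; the precise form used in the paper may differ (e.g., a tighter kl-inverse version), and the notation $P_{\mathrm{LM}}$, $n$, $\delta$ below is illustrative rather than taken from the paper. With a point-mass posterior on a single discrete prompt $h$ and a prior over prompts given by a language model, a classical Occam-style PAC-Bayes bound states that, with probability at least $1-\delta$ over $n$ training examples,

$$\mathrm{err}(h) \;\le\; \widehat{\mathrm{err}}(h) \;+\; \sqrt{\frac{-\log P_{\mathrm{LM}}(h) + \log(1/\delta)}{2n}},$$

so the complexity penalty is simply the negative log-probability the language model assigns to the prompt, which stays small for natural, fluent prompts.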
