Poster in Workshop: New Frontiers in Learning, Control, and Dynamical Systems
In-Context Decision-Making from Supervised Pretraining
Jonathan Lee · Annie Xie · Aldo Pacchiano · Yash Chandak · Chelsea Finn · Ofir Nachum · Emma Brunskill
Large transformer models trained on diverse datasets have shown a remarkable ability to learn in-context, achieving high few-shot performance on tasks they were not explicitly trained to solve. In this paper, we study the in-context learning capabilities of transformers in decision-making problems, i.e., bandits and Markov decision processes. To do so, we introduce and study a supervised pretraining method where the transformer predicts an optimal action given a query state and an in-context dataset of interactions, across a diverse set of tasks. This procedure, while simple, produces an in-context algorithm with several surprising capabilities. We observe that the pretrained transformer can be used to solve a range of decision-making problems, exhibiting both exploration online and conservatism offline, despite not being explicitly trained to do so. It also generalizes beyond the pretraining distribution to new tasks and automatically adapts its decision-making strategies to unknown structure. Theoretically, we show the pretrained transformer can be viewed as an implementation of posterior sampling. We further leverage this connection to provide guarantees on its regret, and prove that it can learn a decision-making algorithm stronger than a source algorithm used to generate its pretraining data. These results suggest a promising yet simple path towards instilling strong in-context decision-making abilities in transformers.
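The supervised pretraining procedure described above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: it assumes Bernoulli bandit tasks, a uniform behavior policy for collecting the in-context dataset, and a simple empirical-mean featurization with a softmax head standing in for the transformer. All function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(num_arms=5):
    # A task is a Bernoulli bandit: one unknown mean reward per arm
    # (an illustrative choice of task distribution).
    return rng.uniform(size=num_arms)

def rollout(means, horizon=20):
    # Collect an in-context dataset of interactions using a uniform
    # behavior policy (the "source algorithm" could be anything).
    actions = rng.integers(len(means), size=horizon)
    rewards = rng.binomial(1, means[actions])
    return actions, rewards

def make_pretraining_example(num_arms=5, horizon=20):
    # One supervised example: an in-context dataset D of (action, reward)
    # pairs, labeled with the task's optimal action a*.
    means = sample_task(num_arms)
    actions, rewards = rollout(means, horizon)
    context = np.stack([actions, rewards], axis=1)
    optimal_action = int(np.argmax(means))
    return context, optimal_action

def empirical_mean_features(context, num_arms=5):
    # Stand-in featurization: per-arm empirical mean reward from the
    # context. In the paper's setup a transformer consumes D directly.
    feats = np.zeros(num_arms)
    for a in range(num_arms):
        mask = context[:, 0] == a
        if mask.any():
            feats[a] = context[mask, 1].mean()
    return feats

def cross_entropy(logits, label):
    # Pretraining loss: -log p(a* | D) under a softmax over actions.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

# One pretraining step's worth of data and loss:
context, a_star = make_pretraining_example()
loss = cross_entropy(10.0 * empirical_mean_features(context), a_star)
```

At deployment, the trained predictor's softmax output over actions can be sampled from at each round, which is what connects this objective to posterior sampling in the paper's analysis.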