Poster
in
Workshop: Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs

Learning In-Context Decision Making with Synthetic MDPs

Akarsh Kumar · Christopher Lu · Louis Kirsch · Phillip Isola

Sat 27 Jul 1 a.m. PDT — 2 a.m. PDT

Abstract:

Current AI models are trained on huge datasets of real-world data. This is increasingly true in RL, where generalist agents are trained on data from hundreds of real environments. It is widely assumed that real data and environments are the only way to capture the intricate complexities of real-world RL tasks. In this paper, we challenge this notion by training generalist in-context decision-making agents solely on data generated by simple random processes. We investigate data generated from eight families of synthetic environments, ranging from Markov chains and bandits to discrete, continuous, and hybrid Markov decision processes (MDPs). Surprisingly, the resulting agents perform comparably to agents trained on real environment data. We additionally analyze which properties of the pretraining MDPs are ideal for producing good agents, giving RL practitioners insight into choosing which environments to train on.
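To make the idea of "data generated by simple random processes" concrete, here is a minimal sketch of sampling one synthetic tabular MDP at random. This is a hypothetical illustration, not the paper's actual generator: the function name `sample_random_mdp` and the choice of Dirichlet transitions with uniform rewards are assumptions for exposition only; the paper studies eight distinct families.

```python
import numpy as np

def sample_random_mdp(n_states=5, n_actions=3, seed=0):
    """Sample a random tabular MDP (illustrative sketch).

    Transition rows are drawn from a symmetric Dirichlet, so each
    P[s, a] is a valid distribution over next states; rewards are
    drawn uniformly in [0, 1] per (state, action) pair.
    """
    rng = np.random.default_rng(seed)
    # P has shape (n_states, n_actions, n_states); each last axis sums to 1.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    # R has shape (n_states, n_actions).
    R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    return P, R

P, R = sample_random_mdp()
```

Rollouts from many such randomly sampled MDPs could then serve as pretraining trajectories for an in-context agent, with a fresh MDP drawn per episode so the agent never sees the same environment twice.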
