Universal Approximation of Mean-Field Models via Transformers
Abstract
Lay Summary
This work shows that transformer networks, the same models behind today's language AIs, can learn to predict how large groups of identical "particles" (birds in a flock, robots in a swarm, or neurons in a simple neural net) move together. Instead of tracking each particle, scientists often use "mean-field" equations that describe the crowd's overall behavior. Because transformers naturally handle many inputs without regard to their order, they are well suited to these indistinguishable-agent systems.

The authors train transformers on two classic examples, the Cucker–Smale flocking model and a mean-field view of two-layer neural-network training, and find excellent agreement with simulated data. They then prove that if a transformer closely approximates the dynamics for a finite number of particles, its error when modeling infinitely many can be bounded mathematically, giving a clear guarantee on how the quality of the finite-particle approximation controls accuracy in the mean-field limit.
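For concreteness, a standard form of the Cucker–Smale model (the paper's exact variant may differ) evolves the positions x_i and velocities v_i of N particles as

\[
\dot{x}_i = v_i, \qquad
\dot{v}_i = \frac{1}{N}\sum_{j=1}^{N} \phi\left(\lVert x_j - x_i \rVert\right)\,(v_j - v_i), \qquad
\phi(r) = \frac{K}{\left(1 + r^2\right)^{\beta}},
\]

where the communication rate \(\phi\) makes each particle align its velocity with its neighbors, with interaction strength \(K\) and decay exponent \(\beta\). As \(N \to \infty\), this system converges to a mean-field equation for the particle distribution.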
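To illustrate why order-insensitivity matters, below is a minimal PyTorch sketch of a transformer that steps all particle states forward at once. It is a hypothetical illustration under assumed sizes, not the authors' architecture; all names and dimensions are placeholders.

import torch
import torch.nn as nn

class ParticleTransformer(nn.Module):
    # Hypothetical surrogate model: each particle's state (x, v) is one
    # token, and omitting positional encodings makes the network
    # permutation-equivariant, matching indistinguishable particles.
    def __init__(self, state_dim=4, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(state_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           dim_feedforward=128,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, state_dim)

    def forward(self, states):
        # states: (batch, N, state_dim); permuting the N particles
        # permutes the outputs in exactly the same way.
        h = self.encoder(self.embed(states))
        return states + self.head(h)  # predicted states one time step later

model = ParticleTransformer()
states = torch.randn(8, 32, 4)   # 8 snapshots of 32 particles, (x, v) in R^2
next_states = model(states)      # shape (8, 32, 4)

Dropping positional encodings is the key design choice here: the network then treats particles as interchangeable, which is exactly the symmetry that mean-field models exploit.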