Spotlight Poster
An analytic theory of creativity in convolutional diffusion models
Mason Kamb · Surya Ganguli
East Exhibition Hall A-B #E-2212
Tue 15 Jul 10 a.m. PDT — 11 a.m. PDT
Modern generative AI can produce a seemingly unlimited amount of apparently "creative" output, mixing and matching features from its training data in novel and often unpredictable ways. Understanding how this process occurs, and how the outputs these models produce relate to the task they were trained to perform and the data they learned from, is a key question for understanding the nature of artificial intelligence. To understand where this "creativity" emerges from, we studied the simplest models we could find that exhibit this ability: convolutional diffusion models. We developed a mathematical theory to explain their behavior, based on a handful of properties that they exhibit. This theory predicted that, while "smarter" models might be able to recall their training data, these simple models can only "mix and match" bits and pieces of the dataset at a time, forming "patchwork quilts" of all of the images they had ever seen in their training set. While seemingly far-fetched, this theory was remarkably predictive: we were able to reproduce, almost exactly and directly from the training data, the images that these models generated, a first in the field of generative AI. Our theory also explains why AI makes certain common mistakes when generating images, such as drawing the wrong number of limbs.
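The "patchwork quilt" picture can be illustrated with a toy sketch: build an output image one pixel at a time by copying from whichever local training patch best matches the surrounding neighborhood of a noisy seed. This is a hypothetical, simplified illustration of the local mix-and-match idea, not the authors' actual theory or algorithm; the data, patch size, and matching rule below are all assumptions made for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "training set": a few small grayscale images (hypothetical data).
train = rng.random((5, 8, 8))

def patchwork_sample(train, noise, k=3):
    """Toy 'patchwork quilt' illustration: for each pixel, find the
    k x k training patch closest (in L2 distance) to the local
    neighborhood of a noisy seed, and copy that patch's center pixel.
    Every output pixel is thus stitched locally from the training set."""
    h, w = noise.shape
    r = k // 2
    padded = np.pad(noise, r, mode="edge")
    # Gather all k x k patches from every training image.
    patches = []
    for img in train:
        p = np.pad(img, r, mode="edge")
        for i in range(h):
            for j in range(w):
                patches.append(p[i:i + k, j:j + k])
    patches = np.stack(patches)          # (num_patches, k, k)
    out = np.empty_like(noise)
    for i in range(h):
        for j in range(w):
            nb = padded[i:i + k, j:j + k]
            # Nearest training patch; copy its center pixel.
            d = ((patches - nb) ** 2).sum(axis=(1, 2))
            out[i, j] = patches[d.argmin(), r, r]
    return out

sample = patchwork_sample(train, rng.random((8, 8)))
print(sample.shape)  # (8, 8)
```

Because every output pixel is copied from some training patch, the result is a mosaic of training fragments: locally consistent, yet globally novel, which is the intuition behind why a purely local model can look "creative" while never inventing genuinely new local content.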