

Invited Talk
at the Workshop on Theoretical Foundations of Foundation Models (TF2M)

Kamalika Chaudhuri (UCSD): Theoretical Foundations of Memorization in Foundation Models

Sat 27 Jul 6 a.m. PDT — 6:30 a.m. PDT

Abstract:

Large foundation models are known to memorize and regurgitate their training data, leading to privacy and other concerns. In this talk, I will look at two theoretically inspired ways of measuring memorization in large foundation models. First, I will look at membership inference, which is used as a proxy for memorization in classification models, and will talk about using discrepancy theory to design an improved membership inference method. Next, I will look at a specific kind of memorization in generative models that we call data-copying, and investigate its properties.
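To make the membership-inference idea concrete, here is a minimal sketch of the standard loss-threshold baseline attack: an example is guessed to be a training-set member when the model's loss on it falls below a threshold. This is a generic illustration with synthetic loss values, not the discrepancy-theory method discussed in the talk; the distributions and threshold below are assumptions chosen for demonstration.

```python
import numpy as np

def loss_threshold_mia(losses, threshold):
    """Predict membership: True when the per-example loss is below the threshold.

    Intuition: models tend to fit (memorize) training examples, so members
    usually incur lower loss than held-out non-members.
    """
    return losses < threshold

rng = np.random.default_rng(0)
# Synthetic per-example losses (hypothetical values for illustration):
# training-set members typically have lower loss than held-out points.
member_losses = rng.normal(loc=0.5, scale=0.2, size=1000)
nonmember_losses = rng.normal(loc=1.5, scale=0.4, size=1000)

threshold = 1.0  # an assumed cutoff; in practice it is calibrated
tpr = loss_threshold_mia(member_losses, threshold).mean()     # true positive rate
fpr = loss_threshold_mia(nonmember_losses, threshold).mean()  # false positive rate
advantage = tpr - fpr  # attack advantage: 0 = no better than chance
print(f"TPR={tpr:.3f}  FPR={fpr:.3f}  advantage={advantage:.3f}")
```

The gap between TPR and FPR quantifies how much the model's loss leaks about membership; a well-generalizing model with similar loss on members and non-members would drive the advantage toward zero.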
