Invited Talk
in
Workshop on Theoretical Foundations of Foundation Models (TF2M)
Kamalika Chaudhuri (UCSD): Theoretical Foundations of Memorization in Foundation Models
Abstract:
Large foundation models are known to memorize and regurgitate their training data, leading to privacy and other concerns. In this talk, I will look at two theoretically inspired ways of measuring memorization in large foundation models. First, I will consider membership inference, which is commonly used as a proxy for memorization in classification models, and I will talk about using discrepancy theory to design an improved membership inference method. Next, I will look at a specific kind of memorization in generative models that we call data-copying, and investigate its properties.
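For context, the sketch below illustrates the basic membership-inference setup as a minimal loss-thresholding baseline on synthetic data. This is the generic attack often used as a starting point, not the discrepancy-based method discussed in the talk; the dataset, model, and split sizes are assumptions chosen purely for illustration.

```python
# Minimal sketch of a loss-thresholding membership inference attack.
# Assumption: a simple synthetic classification task and a logistic
# regression "target model"; NOT the discrepancy-based method from the talk.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic data: half used for training ("members"), half held out.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_mem, y_mem = X[:1000], y[:1000]      # members: seen during training
X_non, y_non = X[1000:], y[1000:]      # non-members: never seen

model = LogisticRegression(max_iter=1000).fit(X_mem, y_mem)

def per_example_loss(model, X, y):
    """Cross-entropy loss of the model on each individual example."""
    probs = model.predict_proba(X)
    return -np.log(probs[np.arange(len(y)), y] + 1e-12)

# Attack score: members tend to incur lower loss than non-members,
# so the negated loss serves as a membership score.
scores = np.concatenate([-per_example_loss(model, X_mem, y_mem),
                         -per_example_loss(model, X_non, y_non)])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])  # 1 = member

# AUC above 0.5 means per-example losses leak membership information,
# i.e., the model has memorized its training set to some degree.
print("membership inference AUC:", roc_auc_score(labels, scores))
```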