Invited Talk
in
Workshop on Theoretical Foundations of Foundation Models (TF2M)
Kamalika Chaudhuri (UCSD): Theoretical Foundations of Memorization in Foundation Models
Abstract:
Large foundation models are known to memorize and regurgitate their training data, leading to privacy and other concerns. In this talk, I will look at two theoretically inspired ways of measuring memorization in large foundation models. First, I will consider membership inference, which is commonly used as a proxy for memorization in classification models, and I will talk about using discrepancy theory to design an improved membership inference method. Next, I will look at a specific kind of memorization in generative models that we call data-copying, and investigate its properties.
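For context, the sketch below illustrates the basic membership-inference setup as a minimal loss-thresholding baseline on synthetic data. This is the generic attack often used as a starting point, not the discrepancy-based method discussed in the talk; the dataset, model, and split sizes are assumptions chosen purely for illustration.

```python
# Minimal sketch of a loss-thresholding membership inference attack.
# Assumption: a simple synthetic classification task and a logistic
# regression "target model"; NOT the discrepancy-based method from the talk.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic data: half used for training ("members"), half held out.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_mem, y_mem = X[:1000], y[:1000]      # members: seen during training
X_non, y_non = X[1000:], y[1000:]      # non-members: never seen

model = LogisticRegression(max_iter=1000).fit(X_mem, y_mem)

def per_example_loss(model, X, y):
    """Cross-entropy loss of the model on each individual example."""
    probs = model.predict_proba(X)
    return -np.log(probs[np.arange(len(y)), y] + 1e-12)

# Attack score: members tend to incur lower loss than non-members,
# so the negated loss serves as a membership score.
scores = np.concatenate([-per_example_loss(model, X_mem, y_mem),
                         -per_example_loss(model, X_non, y_non)])
labels = np.concatenate([np.ones(1000), np.zeros(1000)])  # 1 = member

# AUC above 0.5 means per-example losses leak membership information,
# i.e., the model has memorized its training set to some degree.
print("membership inference AUC:", roc_auc_score(labels, scores))
```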