Contributed Talk
Workshop: Theory and Practice of Differential Privacy

Differentially Private Sampling from Distributions

Satchit Sivakumar · Marika Swanberg · Sofya Raskhodnikova · Adam Smith

Abstract: We initiate an investigation of private sampling from distributions. Given a dataset with $n$ independent observations from an unknown distribution $P$, a sampling algorithm must output a single observation from a distribution that is close in total variation distance to $P$ while satisfying differential privacy. Sampling abstracts the goal of generating small amounts of realistic-looking data. We provide upper and lower bounds for the dataset size needed for this task for two natural families of distributions: arbitrary distributions on $\{1,\ldots ,k\}$ and product distributions on $\{0,1\}^d$. We demonstrate that, in some parameter regimes, private sampling requires asymptotically fewer observations than learning a description of $P$ nonprivately; in other regimes, however, sampling proves to be as difficult as private learning.
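To make the task concrete, here is a minimal sketch of one simple baseline sampler for distributions on $\{1,\ldots,k\}$. It is an illustration only and is not claimed to be the algorithm analyzed in the talk. It samples from the empirical distribution after adding a smoothing weight $\alpha$ to every count. Under the assumption that neighboring datasets differ by replacing one observation, the choice $\alpha = 1/(e^{\varepsilon}-1)$ bounds every output probability ratio by $e^{\varepsilon}$. The function names and the parameter `epsilon` are hypothetical and were chosen for this example.

```python
import math
import random
from collections import Counter


def smoothing_weight(epsilon):
    """Return alpha = 1 / (e^epsilon - 1), the pseudo-count added to each bin.

    Assumption: neighbors differ by replacing one record, so one count drops
    by 1 and another rises by 1 while n stays fixed. The worst-case ratio is
    then (1 + alpha) / alpha = e^epsilon.
    """
    return 1.0 / math.expm1(epsilon)


def output_distribution(data, k, epsilon):
    """Probabilities over {1, ..., k} that the sampler below draws from."""
    alpha = smoothing_weight(epsilon)
    counts = Counter(data)
    total = len(data) + k * alpha
    return [(counts.get(j, 0) + alpha) / total for j in range(1, k + 1)]


def private_sample(data, k, epsilon, rng=random):
    """Output a single element of {1, ..., k} from the smoothed distribution."""
    probs = output_distribution(data, k, epsilon)
    return rng.choices(range(1, k + 1), weights=probs, k=1)[0]


if __name__ == "__main__":
    data = [1, 1, 2, 3]  # n = 4 illustrative observations, k = 4
    print(output_distribution(data, k=4, epsilon=1.0))
    print(private_sample(data, k=4, epsilon=1.0))
```

The output distribution is a mixture of the empirical distribution and the uniform distribution with mixing weight $k\alpha/(n+k\alpha)$ on the uniform part, so its total variation distance from the empirical distribution is at most that weight. This shows the trade-off in this baseline: a smaller $\varepsilon$ means a larger $\alpha$, and $n$ must grow to keep the output close to $P$.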