Timezone: »
The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space
Adam Smith · Shuang Song · Abhradeep Guha Thakurta
We revisit the problem of counting the number of distinct elements $\dist$ in a data stream $D$, over a domain $[u]$. We propose an $(\epsilon,\delta)$-differentially private algorithm that approximates $\dist$ within a factor of $(1\pm\gamma)$, and with additive error of $O(\sqrt{\ln(1/\delta)}/\epsilon)$, using space $O(\ln(\ln(u)/\gamma)/\gamma^2)$. We improve on the prior work at least quadratically and up to exponentially, in terms of both space and additive error. Our additive error guarantee is optimal up to a factor of $O(\sqrt{\ln(1/\delta)})$, and the space bound is optimal up to a factor of $O\left(\min\left\{\ln\left(\frac{\ln(u)}{\gamma}\right), \frac{1}{\gamma^2}\right\}\right)$. We assume the existence of an ideal uniform random hash function, and ignore the space required to store it. We later relax this requirement by assuming pseudorandom functions and appealing to a computational variant of differential privacy, SIM-CDP.
Our algorithm is built on top of the celebrated Flajolet-Martin (FM) sketch. We show that FM-sketch is differentially private as is, as long as there are $\approx \sqrt{\ln(1/\delta)}/(\epsilon\gamma)$ distinct elements in the data set. Along the way, we prove a structural result showing that the maximum of $k$ i.i.d. random variables is statistically close (in the sense of $\epsilon$-differential privacy) to the maximum of $(k+1)$ i.i.d. samples from the same distribution, as long as $k=\Omega\left(\frac{1}{\epsilon}\right)$.
Finally, experiments show that our algorithms introduces error within an order of magnitude of the non-private analogues for streams with thousands of distinct elements, even while providing strong privacy guarantee ($\eps\leq 1$).
Author Information
Adam Smith (Boston University)
Shuang Song (Google)
Abhradeep Guha Thakurta (Google)
More from the Same Authors
-
2021 : Nonparametric Differentially Private Confidence Intervals for the Median »
Joerg Drechsler · Ira Globus-Harris · Audra McMillan · Adam Smith · Jayshree Sarathy -
2021 : Practical and Private (Deep) Learning without Sampling orShuffling »
Peter Kairouz · Hugh B McMahan · Shuang Song · Om Dipakbhai Thakkar · Abhradeep Guha Thakurta · Zheng Xu -
2021 : Differentially Private Model Personalization »
Prateek Jain · J K Rush · Adam Smith · Shuang Song · Abhradeep Guha Thakurta -
2021 : When Is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning? »
Gavin Brown · Mark Bun · Vitaly Feldman · Adam Smith · Kunal Talwar -
2021 : Differentially Private Sampling from Distributions »
Satchit Sivakumar · Marika Swanberg · Sofya Raskhodnikova · Adam Smith -
2021 : Covariance-Aware Private Mean Estimation Without Private Covariance Estimation »
Gavin Brown · Marco Gaboradi · Adam Smith · Jonathan Ullman · Lydia Zakynthinou -
2021 : Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates »
Steve Chien · Prateek Jain · Walid Krichene · Steffen Rendle · Shuang Song · Abhradeep Guha Thakurta · Li Zhang -
2023 Poster: The Price of Differential Privacy under Continual Observation »
Palak Jain · Sofya Raskhodnikova · Satchit Sivakumar · Adam Smith -
2023 Poster: Why Is Public Pretraining Necessary for Private Model Training? »
Arun Ganesh · Mahdi Haghifam · Milad Nasresfahani · Sewoong Oh · Thomas Steinke · Om Thakkar · Abhradeep Guha Thakurta · Lun Wang -
2023 Oral: Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning »
Christopher Choquette-Choo · Hugh B McMahan · J K Rush · Abhradeep Guha Thakurta -
2023 Oral: The Price of Differential Privacy under Continual Observation »
Palak Jain · Sofya Raskhodnikova · Satchit Sivakumar · Adam Smith -
2023 Poster: Multi-Task Differential Privacy Under Distribution Skew »
Walid Krichene · Prateek Jain · Shuang Song · Mukund Sundararajan · Abhradeep Guha Thakurta · Li Zhang -
2023 Poster: Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning »
Christopher Choquette-Choo · Hugh B McMahan · J K Rush · Abhradeep Guha Thakurta -
2022 Poster: Public Data-Assisted Mirror Descent for Private Model Training »
Ehsan Amid · Arun Ganesh · Rajiv Mathews · Swaroop Ramaswamy · Shuang Song · Thomas Steinke · Thomas Steinke · Vinith Suriyakumar · Om Thakkar · Abhradeep Guha Thakurta -
2022 Spotlight: Public Data-Assisted Mirror Descent for Private Model Training »
Ehsan Amid · Arun Ganesh · Rajiv Mathews · Swaroop Ramaswamy · Shuang Song · Thomas Steinke · Thomas Steinke · Vinith Suriyakumar · Om Thakkar · Abhradeep Guha Thakurta -
2021 Poster: Practical and Private (Deep) Learning Without Sampling or Shuffling »
Peter Kairouz · Brendan McMahan · Shuang Song · Om Dipakbhai Thakkar · Abhradeep Guha Thakurta · Zheng Xu -
2021 Spotlight: Practical and Private (Deep) Learning Without Sampling or Shuffling »
Peter Kairouz · Brendan McMahan · Shuang Song · Om Dipakbhai Thakkar · Abhradeep Guha Thakurta · Zheng Xu -
2021 Poster: Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates »
Steve Chien · Prateek Jain · Walid Krichene · Steffen Rendle · Shuang Song · Abhradeep Guha Thakurta · Li Zhang -
2021 Oral: Private Alternating Least Squares: Practical Private Matrix Completion with Tighter Rates »
Steve Chien · Prateek Jain · Walid Krichene · Steffen Rendle · Shuang Song · Abhradeep Guha Thakurta · Li Zhang