Session

Privacy, Anonymity, and Security 1


Wed 11 July 7:00 - 7:20 PDT

Differentially Private Matrix Completion Revisited

Prateek Jain · Om Dipakbhai Thakkar · Abhradeep Thakurta

We provide the first provably joint differentially private algorithm with formal utility guarantees for the problem of user-level privacy-preserving collaborative filtering. Our algorithm is based on the Frank-Wolfe method, and it consistently estimates the underlying preference matrix as long as the number of users $m$ is $\omega(n^{5/4})$, where $n$ is the number of items, and each user provides her preference for at least $\sqrt{n}$ randomly selected items. Along the way, we provide an optimal differentially private algorithm for singular vector computation, based on the celebrated Oja's method, that provides significant savings in terms of space and time while operating on sparse matrices. We also empirically evaluate our algorithm on a suite of datasets, and show that it consistently outperforms the state-of-the-art private algorithms.
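
A minimal sketch of the kind of noisy Oja iteration the abstract alludes to (the step size, iteration count, and noise calibration below are illustrative assumptions of ours, not the paper's exact algorithm):

    import numpy as np

    def private_top_singular_vector(A, epsilon, delta, iters=50, eta=0.1, seed=None):
        # Illustrative noisy Oja iteration for the top right singular vector of A.
        # Assumptions (ours, not the paper's): rows of A have L2 norm at most 1,
        # and per-step Gaussian noise is calibrated by basic composition over
        # `iters` steps rather than a tighter analysis.
        rng = np.random.default_rng(seed)
        n = A.shape[1]
        sigma = np.sqrt(2.0 * iters * np.log(1.25 / delta)) / epsilon
        v = rng.normal(size=n)
        v /= np.linalg.norm(v)
        for _ in range(iters):
            g = A.T @ (A @ v)                        # Oja update direction; sparse-friendly
            g = g + rng.normal(scale=sigma, size=n)  # Gaussian noise for privacy
            v = v + eta * g
            v /= np.linalg.norm(v)
        return v

The update touches A only through matrix-vector products, which is why an Oja-style iteration can exploit sparsity for the space and time savings the abstract mentions.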

Wed 11 July 7:20 - 7:30 PDT

Mitigating Bias in Adaptive Data Gathering via Differential Privacy

Seth Neel · Aaron Roth

Data that is gathered adaptively, via bandit algorithms for example, exhibits bias. This is true both for simple numeric-valued data (the empirical means tracked by stochastic bandit algorithms are biased downwards) and for more complicated data (hypothesis tests run on complex data gathered via contextual bandit algorithms lead to false discovery). In this paper, we show that this problem is mitigated if the data collection procedure is differentially private. This lets us both bound the bias of simple numeric-valued quantities (like the empirical means of stochastic bandit algorithms) and correct the p-values of hypothesis tests run on the adaptively gathered data. Moreover, there exist differentially private bandit algorithms with near-optimal regret bounds: we apply existing theorems in the simple stochastic case, and give a new analysis for linear contextual bandits. We complement our theoretical results with experiments validating our theory.
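
As a rough illustration of differentially private release of bandit statistics (a generic Laplace-mechanism release of arm means under an assumed reward range of [0, 1]; real private bandit algorithms, e.g. tree-based counters, add noise across rounds more carefully, and this is not the paper's construction):

    import numpy as np

    def private_arm_means(rewards_per_arm, epsilon, seed=None):
        # Release each arm's empirical mean with epsilon-differential privacy.
        # Assumption (ours): rewards lie in [0, 1], so the sensitivity of an
        # arm's mean over n_k pulls is 1/n_k, and Laplace noise with scale
        # 1/(n_k * epsilon) suffices for that arm.
        rng = np.random.default_rng(seed)
        means = []
        for rewards in rewards_per_arm:
            n_k = len(rewards)
            noise = rng.laplace(scale=1.0 / (n_k * epsilon))
            means.append(float(np.mean(rewards) + noise))
        return means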

Wed 11 July 7:30 - 7:40 PDT

Local Private Hypothesis Testing: Chi-Square Tests

Marco Gaboardi · Ryan Rogers

The local model for differential privacy is emerging as the reference model for practical applications that collect and share sensitive information while satisfying strong privacy guarantees. In the local model, there is no trusted entity that holds each individual's raw data, as is assumed in the traditional curator model; instead, each individual's data is perturbed before it is shared. We explore the design of private hypothesis tests in the local model, where each data entry is perturbed to ensure the privacy of each participant. Specifically, we analyze locally private chi-square tests for goodness of fit and independence testing.
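
A sketch of a locally private goodness-of-fit test in this spirit, assuming $k$-ary randomized response as the local perturbation and a naive debiasing step (the paper's corrected tests account for the privatization noise in the null distribution more carefully):

    import numpy as np
    from scipy.stats import chi2

    def k_randomized_response(x, k, epsilon, rng):
        # Report the true category with prob e^eps / (e^eps + k - 1),
        # otherwise report one of the other k-1 categories uniformly.
        p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
        if rng.random() < p_true:
            return x
        return rng.choice([c for c in range(k) if c != x])

    def private_goodness_of_fit(data, null_probs, epsilon, seed=None):
        # data: array of ints in {0, ..., k-1}, one entry per participant.
        rng = np.random.default_rng(seed)
        k, n = len(null_probs), len(data)
        reports = np.array([k_randomized_response(x, k, epsilon, rng) for x in data])
        counts = np.bincount(reports, minlength=k).astype(float)
        p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
        p_other = 1.0 / (np.exp(epsilon) + k - 1)
        # Invert E[counts_j] = n * p_other + (p_true - p_other) * true_counts_j.
        debiased = (counts - n * p_other) / (p_true - p_other)
        expected = n * np.asarray(null_probs)
        stat = float(np.sum((debiased - expected) ** 2 / expected))
        return stat, float(chi2.sf(stat, df=k - 1))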

Wed 11 July 7:40 - 7:50 PDT

Locally Private Hypothesis Testing

Or Sheffet

We initiate the study of differentially private hypothesis testing in the local model, under both the standard (symmetric) randomized-response mechanism (Warner, 1965; Kasiviswanathan et al., 2008) and the newer (non-symmetric) mechanisms (Bassily & Smith, 2015; Bassily et al., 2017). First, we study the general framework of mapping each user's type into a signal and show that the problem of finding the maximum-likelihood distribution over the signals is feasible. Then we discuss the randomized-response mechanism and show that, in essence, it maps the null and alternative hypotheses onto new sets, an affine translation of the originals. We give sample complexity bounds for identity and independence testing under randomized response. Next, we move to the newer non-symmetric mechanisms and show that there, too, finding the maximum-likelihood distribution is feasible. Under the mechanism of Bassily et al. we give identity and independence testers with better sample complexity than in the symmetric case, and we also propose a $\chi^2$-based identity tester which we investigate empirically.
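
For concreteness, the affine translation takes a simple closed form under $k$-ary randomized response (our notation, assuming each user reports her true type with probability $e^\varepsilon/(e^\varepsilon+k-1)$ and any other type with probability $1/(e^\varepsilon+k-1)$): the distribution $q$ over signals is $q = \lambda p + (1-\lambda)\,u_k$, where $\lambda = (e^\varepsilon-1)/(e^\varepsilon+k-1)$, $p$ is the true type distribution, and $u_k$ is the uniform distribution over the $k$ types. The null and alternative hypothesis sets are thus mapped to their images under this affine map.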

Wed 11 July 7:50 - 8:00 PDT

INSPECTRE: Privately Estimating the Unseen

Jayadev Acharya · Gautam Kamath · Ziteng Sun · Huanyu Zhang

We develop differentially private methods for estimating various distributional properties. Given a sample from a discrete distribution $p$, some functional $f$, and accuracy and privacy parameters $\alpha$ and $\varepsilon$, the goal is to estimate $f(p)$ up to accuracy $\alpha$, while maintaining $\varepsilon$-differential privacy of the sample. We prove almost-tight bounds on the sample size required for this problem for several functionals of interest, including support size, support coverage, and entropy. We show that the cost of privacy is negligible in a variety of settings, both theoretically and experimentally. Our methods are based on a sensitivity analysis of several state-of-the-art methods for estimating these properties with sublinear sample complexities.
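
A minimal sketch of the sensitivity-plus-Laplace recipe for one such functional, entropy (we privatize the naive plug-in estimator with a crude sensitivity bound; the paper instead privatizes sharper estimators with sublinear sample complexity):

    import numpy as np

    def private_plugin_entropy(samples, epsilon, seed=None):
        # epsilon-DP plug-in entropy estimate (in nats) via the Laplace mechanism.
        # Assumption (ours): changing one of n samples moves the plug-in entropy
        # by at most roughly 2*log(n)/n, a crude sensitivity bound; noise is
        # added at that scale.
        rng = np.random.default_rng(seed)
        samples = np.asarray(samples)
        n = len(samples)
        _, counts = np.unique(samples, return_counts=True)
        p_hat = counts / n
        h_hat = float(-np.sum(p_hat * np.log(p_hat)))
        sensitivity = 2.0 * np.log(n) / n
        return h_hat + rng.laplace(scale=sensitivity / epsilon)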