Poster
Differentially Private Federated $k$-Means Clustering with Server-Side Data
Jonathan Scott · Christoph Lampert · David Saulpic
East Exhibition Hall A-B #E-901
Clustering is a technique used to group similar items in large, unlabeled datasets. Traditional clustering methods assume that all the data is stored in one central location. However, in today's world, data is often generated and stored across many separate devices, like smartphones, and privacy concerns often prevent this data from being shared.

To address this challenge, we introduce a new method that allows devices to collaborate on clustering without sharing their raw data. This approach protects user privacy using a technique called differential privacy, which ensures that nothing specific about any individual device can be inferred from the final clustering results. A key part of our method is using a small amount of publicly available or server-side data to help kick-start the clustering process, which is then refined collaboratively.

We support our method with both a theoretical analysis of its performance and an experimental evaluation that demonstrates its practical usefulness.
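
To make the general idea concrete, the following is a minimal illustrative sketch of one noisy, federated refinement round of $k$-means. It is not the authors' algorithm. It assumes a generic Lloyd-style update with Gaussian noise added to the aggregated cluster sums and counts, and it assumes the initial centers were already chosen from server-side data. All names (`client_update`, `dp_federated_step`, `clip_norm`, `sigma`) are hypothetical, and the noise scale is not calibrated to any formal privacy budget.

```python
import random


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def client_update(points, centers, clip_norm):
    """Per-cluster sums and counts computed locally on one client.

    Each point is rescaled to have norm at most `clip_norm`, so that a single
    point's contribution to the sums is bounded (an assumption of this sketch).
    """
    k, d = len(centers), len(centers[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0.0] * k
    for x in points:
        norm = sum(v * v for v in x) ** 0.5
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        x = [v * scale for v in x]
        j = nearest(x, centers)
        counts[j] += 1.0
        sums[j] = [s + v for s, v in zip(sums[j], x)]
    return sums, counts


def dp_federated_step(clients, centers, clip_norm, sigma, rng):
    """One refinement round: aggregate client statistics, add noise, re-average.

    `sigma` is a hypothetical noise multiplier; it is NOT calibrated here to
    any particular (epsilon, delta) guarantee.
    """
    k, d = len(centers), len(centers[0])
    total_sums = [[0.0] * d for _ in range(k)]
    total_counts = [0.0] * k
    for points in clients:  # raw points never leave this loop body
        sums, counts = client_update(points, centers, clip_norm)
        for j in range(k):
            total_counts[j] += counts[j]
            total_sums[j] = [a + b for a, b in zip(total_sums[j], sums[j])]

    new_centers = []
    for j in range(k):
        noisy_count = total_counts[j] + rng.gauss(0.0, sigma)
        noisy_sum = [s + rng.gauss(0.0, sigma * clip_norm) for s in total_sums[j]]
        if noisy_count >= 1.0:
            new_centers.append([s / noisy_count for s in noisy_sum])
        else:
            # Too little (noisy) mass in this cluster: keep the previous center.
            new_centers.append(list(centers[j]))
    return new_centers


# Usage: centers assumed to come from a small server-side dataset.
server_centers = [[0.0, 0.0], [10.0, 10.0]]
clients = [[[0.0, 0.0], [0.0, 1.0]], [[10.0, 10.0], [10.0, 11.0]]]
refined = dp_federated_step(clients, server_centers, clip_norm=100.0,
                            sigma=0.5, rng=random.Random(0))
```

In this sketch only aggregated, noised statistics are used to update the centers. A real deployment would also need to bound each device's total contribution and account for the privacy budget across rounds, neither of which is shown here.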