Gaussian mixture models (GMMs) are the most widely used statistical model for the k-means clustering problem and form a popular framework for clustering in machine learning and data analysis. In this paper, we propose a natural semi-random model for k-means clustering that generalizes the Gaussian mixture model, and that we believe will be useful in identifying robust algorithms. Our first contribution is a polynomial-time algorithm that provably recovers the ground truth up to a small classification error with high probability, assuming a certain separation between the components. Perhaps surprisingly, the algorithm we analyze is the popular Lloyd's algorithm for k-means clustering, which is the method of choice in practice. Our second result complements this upper bound with a nearly matching lower bound on the number of misclassified points incurred by any k-means clustering algorithm on the semi-random model.
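For readers unfamiliar with Lloyd's algorithm, the following is a minimal textbook sketch, not the exact procedure analyzed in the paper. The function names, the random initialization from data points, and the stopping rule are assumptions made for illustration; the abstract does not specify the initialization or the separation conditions used in the analysis.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lloyd_kmeans(points, k, max_iters=100, seed=0):
    """Textbook Lloyd iterations on a list of equal-length tuples.

    Assumption: centers are initialized as k distinct data points chosen
    at random; the paper's own initialization is not stated in the abstract.
    """
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest current center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: each center moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers
```

Calling `lloyd_kmeans(points, k)` returns one cluster label per input point together with the final centers. Cluster indices are arbitrary, so labels are only meaningful up to a permutation.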
Author Information
Aravindan Vijayaraghavan (Northwestern University)
Pranjal Awasthi (Rutgers University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: Clustering Semi-Random Mixtures of Gaussians
  Wed. Jul 11th 04:15 -- 07:00 PM Room Hall B #39
More from the Same Authors
- 2021 Poster: Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch)
  Hunter Lang · David Sontag · Aravindan Vijayaraghavan
- 2021 Oral: Graph Cuts Always Find a Global Optimum for Potts Models (With a Catch)
  Hunter Lang · David Sontag · Aravindan Vijayaraghavan
- 2020 Poster: Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks
  Pranjal Awasthi · Natalie Frank · Mehryar Mohri
- 2019 Poster: Fair k-Center Clustering for Data Summarization
  Matthäus Kleindessner · Pranjal Awasthi · Jamie Morgenstern
- 2019 Poster: Guarantees for Spectral Clustering with Fairness Constraints
  Matthäus Kleindessner · Samira Samadi · Pranjal Awasthi · Jamie Morgenstern
- 2019 Oral: Guarantees for Spectral Clustering with Fairness Constraints
  Matthäus Kleindessner · Samira Samadi · Pranjal Awasthi · Jamie Morgenstern
- 2019 Oral: Fair k-Center Clustering for Data Summarization
  Matthäus Kleindessner · Pranjal Awasthi · Jamie Morgenstern
- 2018 Poster: Crowdsourcing with Arbitrary Adversaries
  Matthäus Kleindessner · Pranjal Awasthi
- 2018 Oral: Crowdsourcing with Arbitrary Adversaries
  Matthäus Kleindessner · Pranjal Awasthi