Timezone: »
Oral
Faster Algorithms for Binary Matrix Factorization
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff
We give faster approximation algorithms for
well-studied variants of Boolean Matrix Factorization,
where we are given a binary matrix $A \in \mathbb{R}^{m \times n}$
and would like to find binary matrices $U \in \{0,1\}^{m \times k}$
and $V \in \{0,1\}^{k \times n}$ so as to minimize $\|U \cdot V - A\|_F^2$.
In the first setting,
$U \cdot V$ denotes multiplication over the integers, and
We give an algorithm outputting
a constant factor approximation in $2^{O(k^2 \log k)} \textrm{poly}(mn)$ time, improving the previous $\min(2^{2^k}, 2^n) \textrm{poly}(mn)$ time.
Our techniques generalize to finding
$U \in \{0,1\}^{m \times k}$
and $V \in \{0,1\}^{k \times n}$ so as to approximately
minimize $\|U \cdot V - A\|_p^p$,
where $p \geq 1$ is any real number, in $2^{O(k^{\lceil p/2 \rceil + 1}\log k)} \textrm{poly}(mn)$ time.
For $p = 1$, our results have a graph-theoretic
interpretation. Namely,
given an unweighted bipartite graph $G$ with $m$ vertices in the left part and $n$ vertices in the right part,
how well can one approximate
$G$ as a disjoint union $H$ of $k$ bipartite cliques (bicliques)? A natural notion of approximation is to find a disjoint union $H$ of $k$ bicliques
so as to
minimize the number of edges in the symmetric difference, i.e.,
to find $H$ with
$$|E(G)\setminus E(H)| + |E(H)\setminus E(G)|
\leq C \min_{\textrm{a disjoint union }H' \textrm{ of } k \textrm{ cliques}}
|E(G)\setminus E(H')| + |E(H') \setminus E(G)|,$$
where $C > 1$ is a constant approximation factor.
It is known
that deciding if a bipartite graph is equal to the disjoint union of $k$
bicliques requires $2^{\Omega(k)} \cdot \textrm{poly}(mn)$ time under standard complexity-theoretic
assumptions, and so this lower bound also applies to the approximate case.
While there are algorithms to decide if $G$
can be expressed as the disjoint union of $k$ bicliques in $2^{O(k^2)} \cdot \textrm{poly}(mn)$ time,
surprisingly, no algorithms were known for the approximate version
of the problem. We give the first constant factor approximation
algorithm for this problem, running in $2^{O(k^2 \log k)} \textrm{poly}(mn)$ time.
Finally, we give the fastest known bicriteria constant factor approximation
when the multiplication $U \cdot V$ is over the finite field $GF(2)$.
We achieve $2^{O(k^3)} \poly(mn)$ time
to output binary rank $O(k \log m)$ matrices $U$ and $V$ whose cost is
as good as the best rank-$k$ approximation, improving the previous
$\min(2^{2^k}mn, \min(m,n)^{O(\poly(k))} \textrm{poly}(mn))$ time.
Author Information
Ravi Kumar (Google)
Rina Panigrahy (Google)
Ali Rahimi (Google)
David Woodruff (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Faster Algorithms for Binary Matrix Factorization »
Fri. Jun 14th 01:30 -- 04:00 AM Room Pacific Ballroom #144
More from the Same Authors
-
2021 : Randomized Response with Prior and Applications to Learning with Label Differential Privacy »
Badih Ghazi · Noah Golowich · Ravi Kumar · Pasin Manurangsi · Chiyuan Zhang -
2021 : User-Level Private Learning via Correlated Sampling »
Badih Ghazi · Ravi Kumar · Pasin Manurangsi -
2023 Poster: Improved Algorithms for White-Box Adversarial Streams »
Ying Feng · David Woodruff -
2023 Poster: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
David Woodruff · Taisuke Yasuda -
2023 Oral: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
David Woodruff · Taisuke Yasuda -
2023 Poster: Bandit Online Linear Optimization with Hints and Queries »
Aditya Bhaskara · Ashok Cutkosky · Ravi Kumar · Manish Purohit -
2023 Poster: Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization »
Ameya Velingker · Maximilian Vötsch · David Woodruff · Samson Zhou -
2023 Poster: On User-Level Private Convex Optimization »
Badih Ghazi · Pritish Kamath · Ravi Kumar · Pasin Manurangsi · Raghu Meka · Chiyuan Zhang -
2022 Poster: Parsimonious Learning-Augmented Caching »
Sungjin Im · Ravi Kumar · Aditya Petety · Manish Purohit -
2022 Poster: RUMs from Head-to-Head Contests »
Matteo Almanza · Flavio Chierichetti · Ravi Kumar · Alessandro Panconesi · Andrew Tomkins -
2022 Spotlight: RUMs from Head-to-Head Contests »
Matteo Almanza · Flavio Chierichetti · Ravi Kumar · Alessandro Panconesi · Andrew Tomkins -
2022 Spotlight: Parsimonious Learning-Augmented Caching »
Sungjin Im · Ravi Kumar · Aditya Petety · Manish Purohit -
2022 Poster: Faster Privacy Accounting via Evolving Discretization »
Badih Ghazi · Pritish Kamath · Ravi Kumar · Pasin Manurangsi -
2022 Spotlight: Faster Privacy Accounting via Evolving Discretization »
Badih Ghazi · Pritish Kamath · Ravi Kumar · Pasin Manurangsi -
2022 Poster: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra »
Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff -
2022 Spotlight: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra »
Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff -
2021 Poster: Differentially Private Aggregation in the Shuffle Model: Almost Central Accuracy in Almost a Single Message »
Badih Ghazi · Ravi Kumar · Pasin Manurangsi · Rasmus Pagh · Amer Sinha -
2021 Spotlight: Differentially Private Aggregation in the Shuffle Model: Almost Central Accuracy in Almost a Single Message »
Badih Ghazi · Ravi Kumar · Pasin Manurangsi · Rasmus Pagh · Amer Sinha -
2021 Poster: Locally Private k-Means in One Round »
Alisa Chang · Badih Ghazi · Ravi Kumar · Pasin Manurangsi -
2021 Oral: Locally Private k-Means in One Round »
Alisa Chang · Badih Ghazi · Ravi Kumar · Pasin Manurangsi -
2021 Poster: Light RUMs »
Flavio Chierichetti · Ravi Kumar · Andrew Tomkins -
2021 Spotlight: Light RUMs »
Flavio Chierichetti · Ravi Kumar · Andrew Tomkins -
2020 Poster: Online Learning with Imperfect Hints »
Aditya Bhaskara · Ashok Cutkosky · Ravi Kumar · Manish Purohit -
2020 Poster: Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead »
Badih Ghazi · Ravi Kumar · Pasin Manurangsi · Rasmus Pagh -
2019 Workshop: Identifying and Understanding Deep Learning Phenomena »
Hanie Sedghi · Samy Bengio · Kenji Hata · Aleksander Madry · Ari Morcos · Behnam Neyshabur · Maithra Raghu · Ali Rahimi · Ludwig Schmidt · Ying Xiao -
2019 Poster: Recursive Sketches for Modular Deep Learning »
Badih Ghazi · Rina Panigrahy · Joshua Wang -
2019 Poster: Dimensionality Reduction for Tukey Regression »
Kenneth Clarkson · Ruosong Wang · David Woodruff -
2019 Poster: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering »
Taisuke Yasuda · David Woodruff · Manuel Fernandez -
2019 Oral: Recursive Sketches for Modular Deep Learning »
Badih Ghazi · Rina Panigrahy · Joshua Wang -
2019 Oral: Dimensionality Reduction for Tukey Regression »
Kenneth Clarkson · Ruosong Wang · David Woodruff -
2019 Oral: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering »
Taisuke Yasuda · David Woodruff · Manuel Fernandez -
2018 Poster: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order »
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang -
2018 Oral: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order »
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang -
2018 Poster: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms »
Charlie Dickens · Graham Cormode · David Woodruff -
2018 Poster: Learning a Mixture of Two Multinomial Logits »
Flavio Chierichetti · Ravi Kumar · Andrew Tomkins -
2018 Oral: Learning a Mixture of Two Multinomial Logits »
Flavio Chierichetti · Ravi Kumar · Andrew Tomkins -
2018 Oral: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms »
Charlie Dickens · Graham Cormode · David Woodruff -
2017 Poster: Algorithms for $\ell_p$ Low-Rank Approximation »
Flavio Chierichetti · Sreenivas Gollapudi · Ravi Kumar · Silvio Lattanzi · Rina Panigrahy · David Woodruff -
2017 Talk: Algorithms for $\ell_p$ Low-Rank Approximation »
Flavio Chierichetti · Sreenivas Gollapudi · Ravi Kumar · Silvio Lattanzi · Rina Panigrahy · David Woodruff