Timezone: »
Poster
Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization
Ameya Velingker · Maximilian Vötsch · David Woodruff · Samson Zhou
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem, where the inputs are a matrix $\mathbf{A}\in\{0,1\}^{n\times d}$, a rank parameter $k>0$, as well as an accuracy parameter $\varepsilon>0$, and the goal is to approximate $\mathbf{A}$ as a product of low-rank factors $\mathbf{U}\in\{0,1\}^{n\times k}$ and $\mathbf{V}\in\{0,1\}^{k\times d}$. Equivalently, we want to find $\mathbf{U}$ and $\mathbf{V}$ that minimize the Frobenius loss $\|\mathbf{U}\mathbf{V} - \mathbf{A}\|_F^2$. Before this work, the state-of-the-art for this problem was the approximation algorithm of Kumar et. al. [ICML 2019], which achieves a $C$-approximation for some constant $C\ge 576$. We give the first $(1+\varepsilon)$-approximation algorithm using running time singly exponential in $k$, where $k$ is typically a small integer. Our techniques generalize to other common variants of the BMF problem, admitting bicriteria $(1+\varepsilon)$-approximation algorithms for $L_p$ loss functions and the setting where matrix operations are performed in $\mathbb{F}_2$. Our approach can be implemented in standard big data models, such as the streaming or distributed models.
Author Information
Ameya Velingker (Google Research)
Maximilian Vötsch (University of Vienna)
David Woodruff (Carnegie Mellon University)
Samson Zhou (UC Berkeley)
More from the Same Authors
-
2021 : Adversarial Robustness of Streaming Algorithms through Importance Sampling »
Vladimir Braverman · Avinatan Hasidim · Yossi Matias · Mariano Schain · Sandeep Silwal · Samson Zhou -
2023 : Efficient Location Sampling Algorithms for Road Networks »
Sara Ahmadian · Kostas Kollias · Ameya Velingker · Sreenivas Gollapudi · Vivek Kumar · Santhoshini Velusamy -
2023 Poster: Improved Algorithms for White-Box Adversarial Streams »
Ying Feng · David Woodruff -
2023 Poster: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
David Woodruff · Taisuke Yasuda -
2023 Poster: Exphormer: Sparse Transformers for Graphs »
Hamed Shirzad · Ameya Velingker · Balaji Venkatachalam · Danica J Sutherland · Ali K Sinop -
2023 Poster: Provable Data Subset Selection For Efficient Neural Networks Training »
Morad Tukan · Samson Zhou · Alaa Maalouf · Daniela Rus · Vladimir Braverman · Dan Feldman -
2023 Oral: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
David Woodruff · Taisuke Yasuda -
2022 Poster: Hardness and Algorithms for Robust and Sparse Optimization »
Eric Price · Sandeep Silwal · Samson Zhou -
2022 Spotlight: Hardness and Algorithms for Robust and Sparse Optimization »
Eric Price · Sandeep Silwal · Samson Zhou -
2022 Poster: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra »
Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff -
2022 Spotlight: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra »
Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff -
2019 Poster: Dimensionality Reduction for Tukey Regression »
Kenneth Clarkson · Ruosong Wang · David Woodruff -
2019 Poster: Faster Algorithms for Binary Matrix Factorization »
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff -
2019 Poster: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering »
Taisuke Yasuda · David Woodruff · Manuel Fernandez -
2019 Oral: Faster Algorithms for Binary Matrix Factorization »
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff -
2019 Oral: Dimensionality Reduction for Tukey Regression »
Kenneth Clarkson · Ruosong Wang · David Woodruff -
2019 Oral: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering »
Taisuke Yasuda · David Woodruff · Manuel Fernandez -
2018 Poster: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order »
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang -
2018 Oral: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order »
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang -
2018 Poster: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms »
Charlie Dickens · Graham Cormode · David Woodruff -
2018 Oral: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms »
Charlie Dickens · Graham Cormode · David Woodruff