Poster
Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang
A central problem in mining massive data streams is characterizing which functions of an underlying frequency vector can be approximated efficiently. Given the prevalence of large-scale linear algebra problems in machine learning, there has recently been considerable effort to extend this data stream problem to that of estimating functions of a matrix. This setting generalizes classical problems to the analogous ones for matrices. For example, instead of estimating frequent-item counts, we now wish to estimate "frequent-direction" counts. A related example is estimating norms, which now corresponds to estimating a vector norm on the singular values of the matrix. Despite recent efforts, the current understanding of such matrix problems is considerably weaker than that of vector problems. We study a number of aspects of estimating matrix norms in a stream that have not previously been considered: (1) multi-pass algorithms, (2) algorithms that see the underlying matrix one row at a time, and (3) time-efficient algorithms. Our multi-pass and row-order algorithms use less memory than what is provably required in the single-pass and entrywise-update models, and thus give separations between these models (in terms of memory). Moreover, all of our algorithms are considerably faster than previous ones. We also prove a number of lower bounds and obtain, for instance, a near-complete characterization of the memory required by row-order algorithms for estimating Schatten $p$-norms of sparse matrices. We complement our results with numerical experiments.
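To make the abstract's terms concrete, here is a minimal sketch of an exact baseline, not the algorithm from the paper. The Schatten $p$-norm is the $\ell_p$ norm of the singular values, and for even $p$ it satisfies $\|A\|_p^p = \mathrm{tr}\big((A^\top A)^{p/2}\big)$. In the row-order model the Gram matrix $A^\top A$ can be accumulated as a sum of outer products of the arriving rows, at a cost of $O(d^2)$ memory for a matrix with $d$ columns. The function names (`gram_from_row_stream`, `schatten_even_p`) are hypothetical and chosen only for illustration; the paper's algorithms aim to use far less memory than this baseline.

```python
from typing import Iterable, List, Sequence


def gram_from_row_stream(rows: Iterable[Sequence[float]], d: int) -> List[List[float]]:
    """Accumulate G = A^T A one row at a time (row-order model), in O(d^2) memory."""
    gram = [[0.0] * d for _ in range(d)]
    for row in rows:
        for i in range(d):
            if row[i] == 0:
                continue  # a zero entry contributes nothing to row i of the outer product
            for j in range(d):
                gram[i][j] += row[i] * row[j]
    return gram


def matmul(x: List[List[float]], y: List[List[float]]) -> List[List[float]]:
    """Plain square matrix product, kept dependency-free for the sketch."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def schatten_even_p(rows: Iterable[Sequence[float]], d: int, p: int) -> float:
    """Exact Schatten p-norm for even p, via ||A||_p^p = trace((A^T A)^(p/2))."""
    if p < 2 or p % 2 != 0:
        raise ValueError("this sketch only handles even integers p >= 2")
    gram = gram_from_row_stream(rows, d)
    power = gram
    for _ in range(p // 2 - 1):
        power = matmul(power, gram)
    trace = sum(power[i][i] for i in range(d))
    return trace ** (1.0 / p)
```

For example, the rows `[3, 0]` and `[0, 4]` form a matrix with singular values 3 and 4, so `schatten_even_p([[3, 0], [0, 4]], d=2, p=2)` gives the Frobenius norm 5.0.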
Author Information
Vladimir Braverman (Johns Hopkins University)
Stephen Chestnut (ETH Zurich)
Robert Krauthgamer (Weizmann Institute of Science)
Yi Li (Nanyang Technological University)
David Woodruff (Carnegie Mellon University)
Lin Yang (Princeton)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order
  Thu. Jul 12th 12:20 -- 12:30 PM Room K11
More from the Same Authors
- 2021 : Adversarial Robustness of Streaming Algorithms through Importance Sampling
  Vladimir Braverman · Avinatan Hasidim · Yossi Matias · Mariano Schain · Sandeep Silwal · Samson Zhou
- 2021 : Bi-directional Adaptive Communication for Heterogenous Distributed Learning
  Dmitrii Avdiukhin · Vladimir Braverman
- 2021 : Gap-Dependent Unsupervised Exploration for Reinforcement Learning
  Jingfeng Wu · Vladimir Braverman · Lin Yang
- 2022 : The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
  Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade
- 2023 Poster: Improved Algorithms for White-Box Adversarial Streams
  Ying Feng · David Woodruff
- 2023 Poster: Sharper Bounds for $\ell_p$ Sensitivity Sampling
  David Woodruff · Taisuke Yasuda
- 2023 Poster: Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
  Jingfeng Wu · Difan Zou · Zixiang Chen · Vladimir Braverman · Quanquan Gu · Sham Kakade
- 2023 Poster: Provable Data Subset Selection For Efficient Neural Networks Training
  Morad Tukan · Samson Zhou · Alaa Maalouf · Daniela Rus · Vladimir Braverman · Dan Feldman
- 2023 Poster: AutoCoreset: An Automatic Practical Coreset Construction Framework
  Alaa Maalouf · Morad Tukan · Vladimir Braverman · Daniela Rus
- 2023 Oral: Sharper Bounds for $\ell_p$ Sensitivity Sampling
  David Woodruff · Taisuke Yasuda
- 2023 Poster: Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization
  Ameya Velingker · Maximilian Vötsch · David Woodruff · Samson Zhou
- 2022 Poster: Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
  Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade
- 2022 Oral: Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
  Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade
- 2022 Poster: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra
  Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff
- 2022 Spotlight: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra
  Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff
- 2020 Poster: Coresets for Clustering in Graphs of Bounded Treewidth
  Daniel Baker · Vladimir Braverman · Lingxiao Huang · Shaofeng H.-C. Jiang · Robert Krauthgamer · Xuan Wu
- 2020 Poster: Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension
  Vladimir Braverman · Robert Krauthgamer · Aditya Krishnan · Roi Sinoff
- 2020 Poster: Obtaining Adjustable Regularization for Free via Iterate Averaging
  Jingfeng Wu · Vladimir Braverman · Lin Yang
- 2020 Poster: On the Noisy Gradient Descent that Generalizes as SGD
  Jingfeng Wu · Wenqing Hu · Haoyi Xiong · Jun Huan · Vladimir Braverman · Zhanxing Zhu
- 2020 Poster: FetchSGD: Communication-Efficient Federated Learning with Sketching
  Daniel Rothchild · Ashwinee Panda · Enayat Ullah · Nikita Ivkin · Ion Stoica · Vladimir Braverman · Joseph E Gonzalez · Raman Arora
- 2019 Poster: Coresets for Ordered Weighted Clustering
  Vladimir Braverman · Shaofeng Jiang · Robert Krauthgamer · Xuan Wu
- 2019 Poster: Dimensionality Reduction for Tukey Regression
  Kenneth Clarkson · Ruosong Wang · David Woodruff
- 2019 Poster: Faster Algorithms for Binary Matrix Factorization
  Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff
- 2019 Poster: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering
  Taisuke Yasuda · David Woodruff · Manuel Fernandez
- 2019 Oral: Faster Algorithms for Binary Matrix Factorization
  Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff
- 2019 Oral: Coresets for Ordered Weighted Clustering
  Vladimir Braverman · Shaofeng Jiang · Robert Krauthgamer · Xuan Wu
- 2019 Oral: Dimensionality Reduction for Tukey Regression
  Kenneth Clarkson · Ruosong Wang · David Woodruff
- 2019 Oral: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering
  Taisuke Yasuda · David Woodruff · Manuel Fernandez
- 2019 Poster: Sample-Optimal Parametric Q-Learning Using Linearly Additive Features
  Lin Yang · Mengdi Wang
- 2019 Oral: Sample-Optimal Parametric Q-Learning Using Linearly Additive Features
  Lin Yang · Mengdi Wang
- 2018 Poster: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms
  Charlie Dickens · Graham Cormode · David Woodruff
- 2018 Oral: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms
  Charlie Dickens · Graham Cormode · David Woodruff
- 2017 Poster: Clustering High Dimensional Dynamic Data Streams
  Lin Yang · Harry Lang · Christian Sohler · Vladimir Braverman · Gereon Frahling
- 2017 Talk: Clustering High Dimensional Dynamic Data Streams
  Lin Yang · Harry Lang · Christian Sohler · Vladimir Braverman · Gereon Frahling