Timezone: »
Spotlight
Oblivious Sketching for Logistic Regression
Alexander Munteanu · Simon Omlor · David Woodruff
What guarantees are possible for solving logistic regression in one pass over a data stream? To answer this question, we present the first data oblivious sketch for logistic regression. Our sketch can be computed in input sparsity time over a turnstile data stream and reduces the size of a $d$-dimensional data set from $n$ to only $\operatorname{poly}(\mu d\log n)$ weighted points, where $\mu$ is a useful parameter which captures the complexity of compressing the data. Solving (weighted) logistic regression on the sketch gives an $O(\log n)$-approximation to the original problem on the full data set. We also show how to obtain an $O(1)$-approximation with slight modifications. Our sketches are fast, simple, easy to implement, and our experiments demonstrate their practicality.
Author Information
Alexander Munteanu (TU Dortmund)
Simon Omlor (TU Dortmund)
David Woodruff (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Oblivious Sketching for Logistic Regression »
Thu. Jul 22nd 04:00 -- 06:00 PM Room
More from the Same Authors
-
2022 Poster: Sketching Algorithms and Lower Bounds for Ridge Regression »
Praneeth Kacham · David Woodruff -
2022 Poster: Learning Augmented Binary Search Trees »
Honghao Lin · Tian Luo · David Woodruff -
2022 Poster: Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time »
David Woodruff · Amir Zandieh -
2022 Spotlight: Sketching Algorithms and Lower Bounds for Ridge Regression »
Praneeth Kacham · David Woodruff -
2022 Spotlight: Learning Augmented Binary Search Trees »
Honghao Lin · Tian Luo · David Woodruff -
2022 Spotlight: Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time »
David Woodruff · Amir Zandieh -
2022 Poster: Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis »
Alexander Munteanu · Simon Omlor · Zhao Song · David Woodruff -
2022 Spotlight: Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis »
Alexander Munteanu · Simon Omlor · Zhao Song · David Woodruff -
2021 Poster: Single Pass Entrywise-Transformed Low Rank Approximation »
Yifei Jiang · Yi Li · Yiming Sun · Jiaxin Wang · David Woodruff -
2021 Poster: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Poster: Dimensionality Reduction for the Sum-of-Distances Metric »
Zhili Feng · Praneeth Kacham · David Woodruff -
2021 Poster: Streaming and Distributed Algorithms for Robust Column Subset Selection »
Shuli Jiang · Dongyu Li · Irene Mengze Li · Arvind Mahankali · David Woodruff -
2021 Spotlight: Streaming and Distributed Algorithms for Robust Column Subset Selection »
Shuli Jiang · Dongyu Li · Irene Mengze Li · Arvind Mahankali · David Woodruff -
2021 Spotlight: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Spotlight: Single Pass Entrywise-Transformed Low Rank Approximation »
Yifei Jiang · Yi Li · Yiming Sun · Jiaxin Wang · David Woodruff -
2021 Oral: Dimensionality Reduction for the Sum-of-Distances Metric »
Zhili Feng · Praneeth Kacham · David Woodruff -
2021 Poster: In-Database Regression in Input Sparsity Time »
Rajesh Jayaram · Alireza Samadian · David Woodruff · Peng Ye -
2021 Spotlight: In-Database Regression in Input Sparsity Time »
Rajesh Jayaram · Alireza Samadian · David Woodruff · Peng Ye -
2020 Poster: Input-Sparsity Low Rank Approximation in Schatten Norm »
Yi Li · David Woodruff -
2019 Poster: A Framework for Bayesian Optimization in Embedded Subspaces »
Amin Nayebi · Alexander Munteanu · Matthias Poloczek -
2019 Oral: A Framework for Bayesian Optimization in Embedded Subspaces »
Amin Nayebi · Alexander Munteanu · Matthias Poloczek