Timezone: »
Oral
Dimensionality Reduction for the Sum-of-Distances Metric
Zhili Feng · Praneeth Kacham · David Woodruff
We give a dimensionality reduction procedure to approximate the sum of distances of a given set of $n$ points in $R^d$ to any ``shape'' that lies in a $k$-dimensional subspace. Here, by ``shape'' we mean any set of points in $R^d$. Our algorithm takes an input in the form of an $n \times d$ matrix $A$, where each row of $A$ denotes a data point, and outputs a subspace $P$ of dimension $O(k^{3}/\epsilon^6)$ such that the projections of each of the $n$ points onto the subspace $P$ and the distances of each of the points to the subspace $P$ are sufficient to obtain an $\epsilon$-approximation to the sum of distances to any arbitrary shape that lies in a $k$-dimensional subspace of $R^d$. These include important problems such as $k$-median, $k$-subspace approximation, and $(j,l)$ subspace clustering with $j \cdot l \leq k$. Dimensionality reduction reduces the data storage requirement to $(n+d)k^{3}/\epsilon^6$ from nnz$(A)$.
Here nnz$(A)$ could potentially be as large as $nd$. Our algorithm runs in time nnz$(A)/\epsilon^2 + (n+d)$poly$(k/\epsilon)$, up to logarithmic factors. For dense matrices, where nnz$(A) \approx nd$, we give a faster algorithm, that runs in time $nd + (n+d)$poly$(k/\epsilon)$ up to logarithmic factors. Our dimensionality reduction algorithm can also be used to obtain poly$(k/\epsilon)$ size coresets for $k$-median and $(k,1)$-subspace approximation problems in polynomial time.
Author Information
Zhili Feng (Carnegie Mellon University)
Praneeth Kacham (Carnegie Mellon University)
David Woodruff (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Dimensionality Reduction for the Sum-of-Distances Metric »
Fri. Jul 23rd 04:00 -- 06:00 AM Room Virtual
More from the Same Authors
-
2022 Poster: Sketching Algorithms and Lower Bounds for Ridge Regression »
Praneeth Kacham · David Woodruff -
2022 Poster: Learning Augmented Binary Search Trees »
Honghao Lin · Tian Luo · David Woodruff -
2022 Poster: Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time »
David Woodruff · Amir Zandieh -
2022 Spotlight: Sketching Algorithms and Lower Bounds for Ridge Regression »
Praneeth Kacham · David Woodruff -
2022 Spotlight: Learning Augmented Binary Search Trees »
Honghao Lin · Tian Luo · David Woodruff -
2022 Spotlight: Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time »
David Woodruff · Amir Zandieh -
2022 Poster: Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis »
Alexander Munteanu · Simon Omlor · Zhao Song · David Woodruff -
2022 Spotlight: Bounding the Width of Neural Networks via Coupled Initialization - A Worst Case Analysis »
Alexander Munteanu · Simon Omlor · Zhao Song · David Woodruff -
2021 Poster: Single Pass Entrywise-Transformed Low Rank Approximation »
Yifei Jiang · Yi Li · Yiming Sun · Jiaxin Wang · David Woodruff -
2021 Poster: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Poster: Streaming and Distributed Algorithms for Robust Column Subset Selection »
Shuli Jiang · Dongyu Li · Irene Mengze Li · Arvind Mahankali · David Woodruff -
2021 Spotlight: Streaming and Distributed Algorithms for Robust Column Subset Selection »
Shuli Jiang · Dongyu Li · Irene Mengze Li · Arvind Mahankali · David Woodruff -
2021 Spotlight: Fast Sketching of Polynomial Kernels of Polynomial Degree »
Zhao Song · David Woodruff · Zheng Yu · Lichen Zhang -
2021 Spotlight: Single Pass Entrywise-Transformed Low Rank Approximation »
Yifei Jiang · Yi Li · Yiming Sun · Jiaxin Wang · David Woodruff -
2021 Poster: In-Database Regression in Input Sparsity Time »
Rajesh Jayaram · Alireza Samadian · David Woodruff · Peng Ye -
2021 Poster: Oblivious Sketching for Logistic Regression »
Alexander Munteanu · Simon Omlor · David Woodruff -
2021 Spotlight: Oblivious Sketching for Logistic Regression »
Alexander Munteanu · Simon Omlor · David Woodruff -
2021 Spotlight: In-Database Regression in Input Sparsity Time »
Rajesh Jayaram · Alireza Samadian · David Woodruff · Peng Ye -
2020 Poster: Input-Sparsity Low Rank Approximation in Schatten Norm »
Yi Li · David Woodruff