Timezone: »
Oral
Dimensionality Reduction for Tukey Regression
Kenneth Clarkson · Ruosong Wang · David Woodruff
We give the first dimensionality reduction methods for the overconstrained Tukey regression problem. The Tukey loss function $\|y\|_M = \sum_i M(y_i)$ has $M(y_i) \approx |y_i|^p$ for residual errors $y_i$ smaller than a prescribed threshold $\tau$, but $M(y_i)$ becomes constant for errors $|y_i| > \tau$. Our results depend on a new structural result, proven constructively, showing that for any $d$-dimensional subspace $L \subset \mathbb{R}^n$, there is a fixed bounded-size subset of coordinates containing, for every $y\in L$, all the large coordinates, {\it with respect to the Tukey loss function}, of~$y$. Our methods reduce a given Tukey regression problem to a smaller weighted version, whose solution is a provably good approximate solution to the original problem. Our reductions are fast, simple, and easy to implement, and we give empirical results demonstrating their practicality, using existing heuristic solvers for the small versions. We also give exponential-time algorithms giving provably good solutions, and hardness results suggesting that a significant speedup in the worst case is unlikely.
Author Information
Kenneth Clarkson (IBM Research)
Ruosong Wang (Carnegie Mellon University)
David Woodruff (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Dimensionality Reduction for Tukey Regression »
Fri. Jun 14th 01:30 -- 04:00 AM Room Pacific Ballroom #208
More from the Same Authors
-
2021 : Online Sub-Sampling for Reinforcement Learning with General Function Approximation »
Dingwen Kong · Ruslan Salakhutdinov · Ruosong Wang · Lin Yang -
2023 Poster: Improved Algorithms for White-Box Adversarial Streams »
Ying Feng · David Woodruff -
2023 Poster: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
David Woodruff · Taisuke Yasuda -
2023 Oral: Sharper Bounds for $\ell_p$ Sensitivity Sampling »
David Woodruff · Taisuke Yasuda -
2023 Poster: Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix Factorization »
Ameya Velingker · Maximilian Vötsch · David Woodruff · Samson Zhou -
2023 Poster: Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes »
Runlong Zhou · Ruosong Wang · Simon Du -
2022 Poster: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra »
Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff -
2022 Spotlight: Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra »
Nadiia Chepurko · Kenneth Clarkson · Lior Horesh · Honghao Lin · David Woodruff -
2021 Poster: Projection techniques to update the truncated SVD of evolving matrices with applications »
Vasileios Kalantzis · Georgios Kollias · Shashanka Ubaru · Athanasios N. Nikolakopoulos · Lior Horesh · Kenneth Clarkson -
2021 Spotlight: Projection techniques to update the truncated SVD of evolving matrices with applications »
Vasileios Kalantzis · Georgios Kollias · Shashanka Ubaru · Athanasios N. Nikolakopoulos · Lior Horesh · Kenneth Clarkson -
2021 Poster: Bilinear Classes: A Structural Framework for Provable Generalization in RL »
Simon Du · Sham Kakade · Jason Lee · Shachar Lovett · Gaurav Mahajan · Wen Sun · Ruosong Wang -
2021 Poster: Instabilities of Offline RL with Pre-Trained Neural Representation »
Ruosong Wang · Yifan Wu · Ruslan Salakhutdinov · Sham Kakade -
2021 Spotlight: Instabilities of Offline RL with Pre-Trained Neural Representation »
Ruosong Wang · Yifan Wu · Ruslan Salakhutdinov · Sham Kakade -
2021 Oral: Bilinear Classes: A Structural Framework for Provable Generalization in RL »
Simon Du · Sham Kakade · Jason Lee · Shachar Lovett · Gaurav Mahajan · Wen Sun · Ruosong Wang -
2020 Poster: Nearly Linear Row Sampling Algorithm for Quantile Regression »
Yi Li · Ruosong Wang · Lin Yang · Hanrui Zhang -
2019 Poster: Faster Algorithms for Binary Matrix Factorization »
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff -
2019 Poster: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering »
Taisuke Yasuda · David Woodruff · Manuel Fernandez -
2019 Oral: Faster Algorithms for Binary Matrix Factorization »
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff -
2019 Oral: Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering »
Taisuke Yasuda · David Woodruff · Manuel Fernandez -
2019 Poster: Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Ruosong Wang -
2019 Oral: Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks »
Sanjeev Arora · Simon Du · Wei Hu · Zhiyuan Li · Ruosong Wang -
2018 Poster: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order »
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang -
2018 Oral: Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order »
Vladimir Braverman · Stephen Chestnut · Robert Krauthgamer · Yi Li · David Woodruff · Lin Yang -
2018 Poster: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms »
Charlie Dickens · Graham Cormode · David Woodruff -
2018 Oral: Leveraging Well-Conditioned Bases: Streaming and Distributed Summaries in Minkowski $p$-Norms »
Charlie Dickens · Graham Cormode · David Woodruff