We address the statistical and optimization impacts of using classical sketch versus Hessian sketch to approximately solve the Matrix Ridge Regression (MRR) problem. Prior research has considered the effects of classical sketch on least squares regression (LSR), a strictly simpler problem. We establish that classical sketch has a similar effect on the optimization properties of MRR as it does on those of LSR: namely, it recovers nearly optimal solutions. In contrast, Hessian sketch does not have this guarantee; instead, the approximation error is governed by a subtle interplay between the "mass" in the responses and the optimal objective value. For both types of approximations, the regularization in the sketched MRR problem gives it significantly different statistical properties from the sketched LSR problem. In particular, there is a bias-variance tradeoff in sketched MRR that is not present in sketched LSR. We provide upper and lower bounds on the biases and variances of sketched MRR; these establish that the variance is significantly increased when classical sketches are used, while the bias is significantly increased when Hessian sketches are used. Empirically, sketched MRR solutions can have risks that are an order of magnitude higher than those of the optimal MRR solutions. We establish theoretically and empirically that model averaging greatly decreases this gap. Thus, in the distributed setting, sketching combined with model averaging is a powerful technique that quickly obtains near-optimal solutions to the MRR problem while greatly mitigating the statistical risks incurred by sketching.
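The two sketching schemes and model averaging can be illustrated with a minimal NumPy sketch. This is an illustrative implementation under assumptions not taken from the abstract: a Gaussian sketching matrix (the paper considers several sketch types), a single response vector rather than a response matrix, and a regularizer scaled by the sample size n.

```python
import numpy as np

def ridge_exact(X, y, lam):
    """Exact ridge solution: argmin_w ||Xw - y||^2 + n*lam*||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

def classical_sketch(X, y, lam, s, rng):
    """Classical sketch: apply S to both the data matrix X and the responses y,
    then solve the (much smaller) sketched ridge problem."""
    n, d = X.shape
    S = rng.standard_normal((s, n)) / np.sqrt(s)  # Gaussian sketching matrix
    SX, Sy = S @ X, S @ y
    return np.linalg.solve(SX.T @ SX + n * lam * np.eye(d), SX.T @ Sy)

def hessian_sketch(X, y, lam, s, rng):
    """Hessian sketch: sketch only the quadratic (Hessian) term X^T X,
    keeping the linear term X^T y exact."""
    n, d = X.shape
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    SX = S @ X
    return np.linalg.solve(SX.T @ SX + n * lam * np.eye(d), X.T @ y)

def model_average(solver, X, y, lam, s, g, rng):
    """Model averaging: average g independently sketched solutions,
    as one would in a distributed setting with g workers."""
    return np.mean([solver(X, y, lam, s, rng) for _ in range(g)], axis=0)
```

Averaging independent sketched solutions reduces the variance term of the risk, which is why it narrows the gap to the optimal MRR solution in the classical-sketch case.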
Author Information
Shusen Wang (UC Berkeley)
Alex Gittens (Rensselaer Polytechnic Institute)
Alex Gittens's research focuses on using randomization to reduce the computational costs of extracting information from large datasets. His work lies at the intersection of randomized algorithms, numerical linear algebra, high-dimensional probability, and machine learning.
Michael Mahoney (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)

2017 Poster: Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging »
Wed Aug 9th 06:30 - 10:00 PM, Room Gallery
More from the Same Authors

2019 Poster: Traditional and Heavy-Tailed Self Regularization in Neural Network Models »
Michael Mahoney · Charles H Martin 
2019 Oral: Traditional and Heavy-Tailed Self Regularization in Neural Network Models »
Michael Mahoney · Charles H Martin 
2018 Poster: Out-of-sample extension of graph adjacency spectral embedding »
Keith Levin · Fred Roosta · Michael Mahoney · Carey Priebe 
2018 Oral: Out-of-sample extension of graph adjacency spectral embedding »
Keith Levin · Fred Roosta · Michael Mahoney · Carey Priebe 
2018 Poster: Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap »
Miles Lopes · Shusen Wang · Michael Mahoney 
2018 Oral: Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap »
Miles Lopes · Shusen Wang · Michael Mahoney 
2017 Poster: Capacity Releasing Diffusion for Speed and Locality. »
Di Wang · Kimon Fountoulakis · Monika Henzinger · Michael Mahoney · Satish Rao 
2017 Talk: Capacity Releasing Diffusion for Speed and Locality. »
Di Wang · Kimon Fountoulakis · Monika Henzinger · Michael Mahoney · Satish Rao