Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time

David Woodruff · Amir Zandieh

Hall E #525

Keywords: [ MISC: General Machine Learning Techniques ] [ MISC: Scalable Algorithms ] [ MISC: Supervised Learning ] [ PM: Monte Carlo and Sampling Methods ] [ PM: Spectral Methods ] [ T: Everything Else ] [ T: Miscellaneous Aspects of Machine Learning ] [ MISC: Everything Else ] [ MISC: Kernel methods ]

Abstract: We propose an input sparsity time sampling algorithm that can spectrally approximate the Gram matrix corresponding to the q-fold column-wise tensor product of q matrices using a nearly optimal number of samples, improving upon all previously known methods by poly(q) factors. Furthermore, for the important special case of the q-fold self-tensoring of a dataset, which is the feature matrix of the degree-q polynomial kernel, the leading term of our method’s runtime is proportional to the size of the dataset and has no dependence on q. Previous techniques either incur a poly(q) factor slowdown in their runtime or remove the dependence on q at the expense of having sub-optimal target dimension, and depend quadratically on the number of data-points in their runtime. Our sampling technique relies on a collection of q partially correlated random projections which can be simultaneously applied to a dataset X in total time that only depends on the size of X, and at the same time their q-fold Kronecker product acts as a near-isometry for any fixed vector in the column span of $X^{\otimes q}$. We also show that our sampling methods generalize to other classes of kernels beyond polynomial, such as Gaussian and Neural Tangent kernels.

Chat is not available.