Poster
Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling
David Woodruff · Amir Zandieh
To accelerate kernel methods, we propose a near input sparsity time method for sampling the high-dimensional space implicitly defined by a kernel transformation. Our main contribution is an importance sampling method for subsampling the feature space of a degree-$q$ tensoring of data points in almost input sparsity time, improving on the recent oblivious sketching of Ahle et al. (2020) by a factor of $q^{5/2}/\epsilon^2$. This yields a subspace embedding for both the polynomial kernel and the Gaussian kernel whose target dimension depends only linearly on the statistical dimension of the kernel, computable in time that depends only linearly on the sparsity of the input dataset. We show how our subspace embedding bounds imply new statistical guarantees for kernel ridge regression. Furthermore, we empirically show that on large-scale regression tasks, our algorithm outperforms state-of-the-art kernel approximation methods.
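The core idea of the abstract, subsampling coordinates of the implicit degree-$q$ tensor product $x^{\otimes q}$ to approximate the polynomial kernel $\langle x, y\rangle^q$, can be illustrated with a toy sketch. The snippet below is not the paper's algorithm: it uses plain uniform sampling of tensor coordinates (the function name `sampled_polynomial_features` and parameters `q`, `m` are ours) to show why sampled, rescaled coordinates of $x^{\otimes q}$ give an unbiased kernel estimate; the paper's contribution is an adaptive importance sampling distribution that reduces the number of samples to roughly the statistical dimension.

```python
import numpy as np

def sampled_polynomial_features(X, q, m, seed=0):
    """Toy sketch (uniform sampling, NOT the paper's adaptive importance
    sampling): approximate the degree-q polynomial kernel <x, y>**q by
    sampling m coordinates of the implicit tensor product x^{(x)q}.

    Each coordinate of x^{(x)q} is a product x[i1] * ... * x[iq] over a
    multi-index (i1, ..., iq). Drawing m multi-indices uniformly with
    replacement from all d**q of them and scaling by sqrt(d**q / m)
    makes <z(x), z(y)> an unbiased estimator of <x, y>**q.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    idx = rng.integers(0, d, size=(m, q))   # m random multi-indices
    Z = np.prod(X[:, idx], axis=2)          # shape (n, m): product over q picks
    return Z * np.sqrt(d**q / m)

# Usage: compare the sampled kernel to the exact polynomial kernel.
X = np.random.default_rng(1).standard_normal((50, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit rows keep products bounded
Z = sampled_polynomial_features(X, q=3, m=4096)
K_exact = (X @ X.T) ** 3
K_approx = Z @ Z.T
print("relative error:", np.linalg.norm(K_approx - K_exact) / np.linalg.norm(K_exact))
```

With uniform sampling the required `m` can grow with $d^q$ in the worst case; the paper's adaptive importance sampling is precisely what replaces this naive distribution so that the target dimension scales with the statistical dimension instead.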
Author Information
David Woodruff (CMU)
Amir Zandieh (EPFL)
More from the Same Authors
- 2018 Poster: Beyond 1/2-Approximation for Submodular Maximization on Massive Data Streams
  Ashkan Norouzi-Fard · Jakub Tarnawski · Slobodan Mitrovic · Amir Zandieh · Aidasadat Mousavifar · Ola Svensson
- 2018 Oral: Beyond 1/2-Approximation for Submodular Maximization on Massive Data Streams
  Ashkan Norouzi-Fard · Jakub Tarnawski · Slobodan Mitrovic · Amir Zandieh · Aidasadat Mousavifar · Ola Svensson
- 2017 Poster: Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees
  Haim Avron · Michael Kapralov · Cameron Musco · Christopher Musco · Ameya Velingker · Amir Zandieh
- 2017 Talk: Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees
  Haim Avron · Michael Kapralov · Cameron Musco · Christopher Musco · Ameya Velingker · Amir Zandieh