Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering
Taisuke Yasuda · David Woodruff · Manuel Fernandez

Thu Jun 13th 11:00 -- 11:20 AM @ Room 101

Kernel methods generalize machine learning algorithms that only depend on the pairwise inner products of the dataset by replacing inner products with kernel evaluations, a function that passes input points through a nonlinear feature map before taking the inner product in a higher dimensional space. In this work, we present nearly tight lower bounds on the number of kernel evaluations required to approximately solve kernel ridge regression (KRR) and kernel $k$-means clustering (KKMC) on $n$ input points. For KRR, our bound for relative error approximation the argmin of the objective function is $\Omega(nd{\mathrm{eff}}^\lambda/\varepsilon)$ where $d{\mathrm{eff}}^\lambda$ is the effective statistical dimension, tight up to a $\log(d_{\mathrm{eff}}^\lambda/\varepsilon)$ factor. For KKMC, our bound for finding a $k$-clustering achieving a relative error approximation of the objective function is $\Omega(nk/\varepsilon)$, tight up to a $\log(k/\varepsilon)$ factor. Our KRR result resolves a variant of an open question of El Alaoui and Mahoney, asking whether the effective statistical dimension is a lower bound on the sampling complexity or not. Furthermore, for the important input distribution case of mixtures of Gaussians, we provide algorithms that bypass the above lower bounds.

Author Information

Tai Yasuda (Carnegie Mellon University)
David Woodruff (Carnegie Mellon University)
Manuel Fernandez (Carnegie Mellon University)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors