ICML 2023 Sharper Bounds for $\ell_p$ Sensitivity Sampling Oral

Oral

Sharper Bounds for $\ell_p$ Sensitivity Sampling

David Woodruff · Taisuke Yasuda

Meeting Room 316 A-C

Oral A3 ML Theory

Abstract: In large-scale machine learning, *random sampling* is a popular way to approximate datasets by a small representative subset of examples. In particular, *sensitivity sampling* is an intensely studied technique which provides provable guarantees on the quality of approximation, while reducing the number of examples to the product of the *VC dimension* $d$ and the *total sensitivity* $\mathfrak{S}$ in remarkably general settings. However, guarantees going beyond this general bound of $\mathfrak{S} d$ are known in perhaps only one setting, for *$\ell_2$ subspace embeddings*, despite intense study of sensitivity sampling in prior work. In this work, we show the first bounds for sensitivity sampling for $\ell_p$ subspace embeddings for $p\neq 2$ that improve over the general $\mathfrak{S} d$ bound, achieving a bound of roughly $\mathfrak{S}^{2/p}$ for $1\leq p<2$ and $\mathfrak{S}^{2-2/p}$ for $2<p<\infty$. For $1\leq p<2$, we show that this bound is tight, in the sense that there exist matrices for which $\mathfrak{S}^{2/p}$ samples are necessary. Furthermore, our techniques yield further new results in the study of sampling algorithms, showing that the *root leverage score sampling* algorithm achieves a bound of roughly $d$ for $1\leq p<2$, and that a combination of leverage score and sensitivity sampling achieves an improved bound of roughly $d^{2/p}\mathfrak{S}^{2-4/p}$ for $2<p<\infty$. Our sensitivity sampling results yield the best known sample complexity for a wide class of structured matrices that have small $\ell_p$ sensitivity.
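
As a rough illustration of the kind of row sampling the abstract discusses, the sketch below covers only the well-understood $p=2$ case, where the $\ell_2$ sensitivities of a full-column-rank matrix coincide with its leverage scores. It is not the paper's algorithm: the function names `leverage_scores` and `sample_rows`, the target sample count `m`, and the choice of independent per-row sampling are all assumptions made for illustration. Each kept row is rescaled by $q_i^{-1/p}$ so that $\|SAx\|_p^p$ equals $\|Ax\|_p^p$ in expectation.

```python
import numpy as np


def leverage_scores(A):
    """l2 leverage scores of A (assumes A has full column rank).

    These are the squared row norms of an orthonormal basis for the
    column span of A; for p = 2 they equal the l2 sensitivities.
    """
    Q, _ = np.linalg.qr(A)  # reduced QR: Q has orthonormal columns
    return np.sum(Q ** 2, axis=1)


def sample_rows(A, scores, m, p, rng):
    """Illustrative score-proportional sampling (hypothetical helper).

    Row i is kept independently with probability
    q_i = min(1, m * scores_i / sum(scores)) and, if kept, rescaled by
    q_i^(-1/p), which makes ||SAx||_p^p an unbiased estimate of ||Ax||_p^p.
    """
    q = np.minimum(1.0, m * scores / scores.sum())
    keep = rng.random(A.shape[0]) < q
    weights = q[keep] ** (-1.0 / p)
    return A[keep] * weights[:, None]


# Small demonstration on a random 200 x 5 matrix (assumed test data).
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
scores = leverage_scores(A)
sampled = sample_rows(A, scores, m=50, p=2, rng=rng)
print("rows kept:", sampled.shape[0], "of", A.shape[0])
```

For $p\neq 2$ the scores passed to `sample_rows` would have to be $\ell_p$ sensitivities (or some approximation of them), which this sketch does not compute.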
