Poster
Composable Sketches for Functions of Frequencies: Beyond the Worst Case
Edith Cohen · Ofir Geri · Rasmus Pagh
Thu Jul 16 12:00 PM -- 12:45 PM & Thu Jul 16 11:00 PM -- 11:45 PM (PDT)
Recently there has been increased interest in using machine learning techniques to improve classical algorithms. In this paper we study when it is possible to construct compact, composable sketches for weighted sampling and statistics estimation according to functions of data frequencies. Such structures are now central components of large-scale data analytics and machine learning pipelines. However, many common functions, such as thresholds and $p$th frequency moments with $p>2$, are known to require polynomial size sketches in the worst case. We explore performance beyond the worst case under two different types of assumptions. The first is having access to noisy advice on item frequencies. This continues the line of work of Hsu et al. (ICLR 2019), who assume predictions are provided by a machine learning model. The second is providing guaranteed performance on a restricted class of input frequency distributions that are better aligned with what is observed in practice. This extends the work on heavy hitters under Zipfian distributions in a seminal paper of Charikar et al. (ICALP 2002). Surprisingly, we show analytically and empirically that "in practice" small polylogarithmic-size sketches provide accuracy for "hard" functions.
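As a rough illustration of what "composable" means in the abstract, here is a minimal Python sketch of a mergeable frequency sketch in the style of CountSketch. It is not the construction from this paper. The class name `CountSketch`, its parameters (`depth`, `width`, `seed`), and the use of `hashlib.blake2b` as the hash function are all assumptions made for this example.

```python
import hashlib
from statistics import median


class CountSketch:
    """Toy CountSketch: `depth` rows of `width` signed counters (illustrative only)."""

    def __init__(self, depth=5, width=256, seed=0):
        self.depth, self.width, self.seed = depth, width, seed
        self.table = [[0] * width for _ in range(depth)]

    def _bucket_and_sign(self, key, row):
        # Derive a bucket index and a +/-1 sign from a seeded hash of the key.
        data = f"{self.seed}:{row}:{key}".encode()
        h = int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
        return (h >> 1) % self.width, (1 if h & 1 else -1)

    def update(self, key, weight=1):
        # Add the signed weight to one counter per row.
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(key, row)
            self.table[row][bucket] += sign * weight

    def estimate(self, key):
        # Frequency estimate: median over rows of the key's signed counter.
        values = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(key, row)
            values.append(sign * self.table[row][bucket])
        return median(values)

    def merge(self, other):
        # Composability: sketches with identical parameters add entrywise.
        if (self.depth, self.width, self.seed) != (other.depth, other.width, other.seed):
            raise ValueError("sketches must share depth, width, and seed")
        merged = CountSketch(self.depth, self.width, self.seed)
        merged.table = [
            [a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(self.table, other.table)
        ]
        return merged
```

Because every update is linear in the counters, merging the sketches of two data partitions gives the same table as sketching the combined data directly. That linearity is the property that lets such sketches be built in a distributed fashion.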
Author Information
Edith Cohen (Google Research and Tel Aviv University)
Ofir Geri (Stanford University)
Rasmus Pagh (IT University of Copenhagen)
More from the Same Authors
- 2022 Poster: On the Robustness of CountSketch to Adaptive Inputs
  Edith Cohen · Xin Lyu · Jelani Nelson · Tamas Sarlos · Moshe Shechner · Uri Stemmer
- 2022 Spotlight: On the Robustness of CountSketch to Adaptive Inputs
  Edith Cohen · Xin Lyu · Jelani Nelson · Tamas Sarlos · Moshe Shechner · Uri Stemmer
- 2022 Poster: FriendlyCore: Practical Differentially Private Aggregation
  Eliad Tsfadia · Edith Cohen · Haim Kaplan · Yishay Mansour · Uri Stemmer
- 2022 Spotlight: FriendlyCore: Practical Differentially Private Aggregation
  Eliad Tsfadia · Edith Cohen · Haim Kaplan · Yishay Mansour · Uri Stemmer
- 2021 Poster: Differentially-Private Clustering of Easy Instances
  Edith Cohen · Haim Kaplan · Yishay Mansour · Uri Stemmer · Eliad Tsfadia
- 2021 Spotlight: Differentially-Private Clustering of Easy Instances
  Edith Cohen · Haim Kaplan · Yishay Mansour · Uri Stemmer · Eliad Tsfadia
- 2020 Poster: Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead
  Badih Ghazi · Ravi Kumar · Pasin Manurangsi · Rasmus Pagh
- 2019 Poster: Self-similar Epochs: Value in arrangement
  Eliav Buchnik · Edith Cohen · Avinatan Hasidim · Yossi Matias
- 2019 Oral: Self-similar Epochs: Value in arrangement
  Eliav Buchnik · Edith Cohen · Avinatan Hasidim · Yossi Matias