Kernel Thinning
Lester Mackey · Raaz Dwivedi
We introduce kernel thinning, a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $\mathbf{k}$ and $\mathcal{O}(n^2)$ time, kernel thinning compresses an $n$-point approximation to $\mathbb{P}$ into a $\sqrt{n}$-point approximation with comparable worst-case integration error across the associated reproducing kernel Hilbert space. With high probability, the maximum discrepancy in integration error is $\mathcal{O}_d(n^{-\frac{1}{2}}\sqrt{\log n})$ for compactly supported $\mathbb{P}$ and $\mathcal{O}_d(n^{-\frac{1}{2}} \sqrt{(\log n)^{d+1}\log\log n})$ for sub-exponential $\mathbb{P}$ on $\mathbb{R}^d$. In contrast, an equal-sized i.i.d. sample from $\mathbb{P}$ suffers $\Omega(n^{-\frac14})$ integration error. Our sub-exponential guarantees resemble the classical quasi-Monte Carlo error rates for uniform $\mathbb{P}$ on $[0,1]^d$ but apply to general distributions on $\mathbb{R}^d$ and a wide range of common kernels. We use our results to derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels and present two vignettes illustrating the practical benefits of kernel thinning over i.i.d. sampling and standard Markov chain Monte Carlo thinning.
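To make the error measure concrete: the worst-case integration error over the unit ball of a reproducing kernel Hilbert space is the maximum mean discrepancy (MMD) between two point sets. The following minimal sketch computes it for a Gaussian kernel and evaluates the two baselines named in the abstract, standard thinning and i.i.d. subsampling, at output size $\sqrt{n}$. It does not implement kernel thinning itself. The function names, the unit bandwidth, the sample size, and the Gaussian test data are all illustrative assumptions rather than choices taken from the paper.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)); bandwidth is an assumed default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average of k(x, y) over all pairs with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd(xs, ys, bandwidth=1.0):
    """MMD between the empirical distributions of the point sets xs and ys."""
    squared = (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


rng = random.Random(0)
n, d = 256, 2  # illustrative sizes, not taken from the paper
points = [tuple(rng.gauss(0.0, 1.0) for _ in range(d)) for _ in range(n)]
m = math.isqrt(n)  # target size sqrt(n)

# Standard thinning: keep every (n / m)-th point of the input sequence.
standard_thinned = points[:: n // m]
# Baseline: a uniformly random subsample of the same size.
iid_subsample = rng.sample(points, m)

print("MMD, standard thinning:", mmd(points, standard_thinned))
print("MMD, random subsample: ", mmd(points, iid_subsample))
```

Because the input here is already an i.i.d. sample, the two baselines behave similarly. The same `mmd` function could be used to score any other $\sqrt{n}$-point summary against the full $n$-point input.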

Author Information

Lester Mackey (Microsoft Research)

Lester Mackey is a machine learning researcher at Microsoft Research, where he develops new tools, models, and theory for large-scale learning tasks driven by applications from healthcare, climate, recommender systems, and the social good. Lester moved to Microsoft from Stanford University, where he was an assistant professor of Statistics and (by courtesy) of Computer Science. He earned his PhD in Computer Science and MA in Statistics from UC Berkeley and his BSE in Computer Science from Princeton University. He co-organized the second-place team in the \$1M Netflix Prize competition for collaborative filtering, won the \$50K Prize4Life ALS disease progression prediction challenge, won prizes for temperature and precipitation forecasting in the yearlong real-time \$800K Subseasonal Climate Forecast Rodeo, and received a best student paper award at the International Conference on Machine Learning.

Raaz Dwivedi (University of California, Berkeley)

More from the Same Authors

  • 2021 : Kernel Thinning »
    Raaz Dwivedi · Lester Mackey
  • 2021 Poster: Online Learning with Optimism and Delay »
    Genevieve Flaspohler · Francesco Orabona · Judah Cohen · Soukayna Mouatadid · Miruna Oprescu · Paulo Orenstein · Lester Mackey
  • 2021 Spotlight: Online Learning with Optimism and Delay »
    Genevieve Flaspohler · Francesco Orabona · Judah Cohen · Soukayna Mouatadid · Miruna Oprescu · Paulo Orenstein · Lester Mackey
  • 2020 Poster: Single Point Transductive Prediction »
    Nilesh Tripuraneni · Lester Mackey
  • 2020 Invited Talk: Doing Some Good with Machine Learning »
    Lester Mackey
  • 2019 Workshop: AI For Social Good (AISG) »
    Margaux Luck · Kris Sankaran · Tristan Sylvain · Sean McGregor · Jonnie Penn · Girmaw Abebe Tadesse · Virgile Sylvain · Myriam Côté · Lester Mackey · Rayid Ghani · Yoshua Bengio
  • 2019 : Networking Lunch (provided) + Poster Session »
    Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki
  • 2017 Poster: Improving Gibbs Sampler Scan Quality with DoGS »
    Ioannis Mitliagkas · Lester Mackey
  • 2017 Talk: Improving Gibbs Sampler Scan Quality with DoGS »
    Ioannis Mitliagkas · Lester Mackey
  • 2017 Poster: Measuring Sample Quality with Kernels »
    Jackson Gorham · Lester Mackey
  • 2017 Talk: Measuring Sample Quality with Kernels »
    Jackson Gorham · Lester Mackey