ICML Poster Rates of Convergence for Sparse Variational Gaussian Process Regression


David Burt · Carl E Rasmussen · Mark van der Wilk

Pacific Ballroom #270

Keywords: [ Approximate Inference ] [ Bayesian Nonparametrics ] [ Gaussian Processes ]

Outstanding Paper


Abstract: Excellent variational approximations to Gaussian process posteriors have been developed which avoid the $\mathcal{O}\left(N^3\right)$ scaling with dataset size $N$. They reduce the computational cost to $\mathcal{O}\left(NM^2\right)$, with $M\ll N$ the number of inducing variables, which summarise the process. While the computational cost seems to be linear in $N$, the true complexity of the algorithm depends on how $M$ must increase to ensure a certain quality of approximation. We show that with high probability the KL divergence can be made arbitrarily small by growing $M$ more slowly than $N$. A particular case is that for regression with normally distributed inputs in $D$ dimensions with the Squared Exponential kernel, $M=\mathcal{O}(\log^D N)$ suffices. Our results show that as datasets grow, Gaussian process posteriors can be approximated cheaply, and provide a concrete rule for how to increase $M$ in continual learning scenarios.
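The shrinking KL divergence described in the abstract can be illustrated numerically. The sketch below is not the authors' code. It assumes the standard collapsed variational bound for sparse GP regression (Titsias, 2009), for which the KL divergence from the optimal variational approximation to the exact posterior equals the exact log marginal likelihood minus the bound. The toy data, the kernel hyperparameters, the jitter, the function names, and the choice of taking the first $M$ inputs as inducing points are all illustrative assumptions.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared Exponential kernel matrix between the rows of a and b."""
    sq = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def log_marginal_likelihood(x, y, noise_var):
    """Exact GP log marginal likelihood, O(N^3) via a Cholesky factor."""
    n = len(y)
    L = np.linalg.cholesky(se_kernel(x, x) + noise_var * np.eye(n))
    alpha = np.linalg.solve(L, y)
    return (-0.5 * alpha @ alpha - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))

def collapsed_elbo(x, y, z, noise_var, jitter=1e-6):
    """Collapsed variational lower bound with inducing inputs z, O(N M^2)."""
    n, m = len(y), len(z)
    Lu = np.linalg.cholesky(se_kernel(z, z) + jitter * np.eye(m))
    A = np.linalg.solve(Lu, se_kernel(z, x))      # Qff = A.T @ A
    B = np.eye(m) + A @ A.T / noise_var
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y) / noise_var
    # log N(y | 0, Qff + noise_var * I) via the matrix inversion lemma
    logdet = 2.0 * np.sum(np.log(np.diag(LB))) + n * np.log(noise_var)
    quad = y @ y / noise_var - c @ c
    # Trace penalty tr(Kff - Qff) / (2 * noise_var); diag(Kff) is 1 here
    trace_term = (n * 1.0 - np.sum(A**2)) / (2.0 * noise_var)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad) - trace_term

# Toy 1-D problem with normally distributed inputs (an assumption).
rng = np.random.default_rng(0)
n, noise_var = 200, 0.1
x = rng.standard_normal((n, 1))
y = np.sin(3.0 * x[:, 0]) + np.sqrt(noise_var) * rng.standard_normal(n)

exact = log_marginal_likelihood(x, y, noise_var)
sizes = [5, 10, 20, 40]
# Nested inducing sets: the first m inputs (x is already in random order).
gaps = [exact - collapsed_elbo(x, y, x[:m], noise_var) for m in sizes]
for m, gap in zip(sizes, gaps):
    print(f"M = {m:3d}   KL gap = {gap:.6f}")
```

Running the script prints the gap for each $M$. Because the inducing sets are nested, the gap should not increase as $M$ grows, up to the small numerical effect of the jitter. The sketch does not attempt to reproduce the paper's bounds or its $M=\mathcal{O}(\log^D N)$ rate.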
