Timezone: »

 
Michal Valko: How Negative Dependence Broke the Quadratic Barrier for Learning with Graphs and Kernels
Michal Valko

Fri Jun 14 02:00 PM -- 02:40 PM (PDT) @

As we advance with resources, we move from reasoning on entities to reasoning on pairs and groups. We have beautiful frameworks: graphs, kernels, DPPs. However, the methods that work with pairs, relationships, and similarity are just slow. Kernel regression, or second-order gradient methods, or sampling from DPPs do not scale to large data, because of the costly construction and storing of matrix Kn. Prior work showed that sampling points according to their ridge leverage scores (RLS) generates small dictionaries with strong spectral approximation guarantees for Kn. However, computing exact RLS requires building and storing the whole kernel matrix. In this talk, we start with SQUEAK, a new online approach for kernel approximations based on RLS sampling that sequentially processes the data, storing a dictionary with a number of points that only depends on the effective dimension deff(gamma) of the dataset. The beauty of negative dependence, that we estimate on the fly, makes it possible to exclude huge portions of dictionary. With the small dictionary, SQUEAK never constructs the whole matrix Kn, runs in linear time O(nd_eff(gamma)^3) w.r.t. n, and requires only a single pass over the dataset. A distributed version of SQUEAK runs in as little as O(log(n)d_eff(gamma)^3) time. This online tool opens out a range of possibilities to finally have scalable, adaptive, and provably accurate kernel methods: semi-supervised learning or Laplacian smoothing on large graphs, scalable GP-UCB, efficient second-order kernel online learning, and even fast DPP sampling, some of these being featured in this workshop.

Author Information

Michal Valko (DeepMind)

Michal is a research scientist in DeepMind Paris and SequeL team at Inria Lille - Nord Europe, France, lead by Philippe Preux and Rémi Munos. He also teaches the course Graphs in Machine Learning at l'ENS Cachan. Michal is primarily interested in designing algorithms that would require as little human supervision as possible. This means 1) reducing the “intelligence” that humans need to input into the system and 2) minimising the data that humans need spend inspecting, classifying, or “tuning” the algorithms. Another important feature of machine learning algorithms should be the ability to adapt to changing environments. That is why he is working in domains that are able to deal with minimal feedback, such as bandit algorithms, semi-supervised learning, and anomaly detection. Most recently he has worked on sequential algorithms with structured decisions where exploiting the structure can lead to provably faster learning. In the past the common thread of Michal's work has been adaptive graph-based learning and its application to the real world applications such as recommender systems, medical error detection, and face recognition. His industrial collaborators include Adobe, Intel, Technicolor, and Microsoft Research. He received his PhD in 2011 from University of Pittsburgh under the supervision of Miloš Hauskrecht and after was a postdoc of Rémi Munos.

More from the Same Authors