Tutorials

The tutorials will take place on July 6, 2015.

ICML 2015 will present 6 invited tutorials. Note that these tutorials will take place in 3 sessions (1 in the morning and 2 in the afternoon). During each session, 2 tutorials will run in parallel.

Advances in Structured Prediction

Hal Daumé III (University of Maryland) and John Langford (Microsoft Research).
Download slides.

Structured prediction is the problem of making a joint set of decisions to optimize a joint loss. There are two families of algorithms for such problems: graphical model approaches and learning-to-search approaches. Graphical models, which include Conditional Random Fields and Structured SVMs, are effective when it is easy to write down a graphical model and solve it. Learning-to-search approaches explicitly predict the joint set of decisions incrementally, conditioning on past and future decisions. Such models may be particularly useful when the dependencies between the predictions are complex, the loss is complex, or the construction of an explicit graphical model is impossible.
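To make the learning-to-search idea concrete, here is a minimal sketch, not the algorithms presented in the tutorial: a hypothetical toy tagger that predicts a joint output one decision at a time, with features for each decision that include the previously predicted tag, trained by simple perceptron-style updates.

```python
# A minimal sketch of the learning-to-search idea: predict a joint output
# incrementally, conditioning each decision on the decisions already made.
# Toy task and all names are hypothetical; real methods (SEARN, DAgger, LOLS)
# use cost-sensitive updates and learned roll-in/roll-out policies.

from collections import defaultdict

def features(tokens, t, prev_tag):
    """Sparse features for the t-th decision, conditioned on the previous one."""
    return {f"word={tokens[t]}": 1.0, f"prev={prev_tag}": 1.0}

class GreedySequenceTagger:
    def __init__(self, tags):
        self.tags = tags
        self.w = defaultdict(float)  # one weight per (feature, tag) pair

    def score(self, feats, tag):
        return sum(v * self.w[(f, tag)] for f, v in feats.items())

    def predict(self, tokens):
        out, prev = [], "<s>"
        for t in range(len(tokens)):
            feats = features(tokens, t, prev)
            prev = max(self.tags, key=lambda y: self.score(feats, y))
            out.append(prev)
        return out

    def train(self, data, epochs=5):
        # Perceptron-style updates along the sequence of decisions,
        # rolling in with the reference (gold) decisions for simplicity.
        for _ in range(epochs):
            for tokens, gold in data:
                prev = "<s>"
                for t in range(len(tokens)):
                    feats = features(tokens, t, prev)
                    pred = max(self.tags, key=lambda y: self.score(feats, y))
                    if pred != gold[t]:
                        for f, v in feats.items():
                            self.w[(f, gold[t])] += v
                            self.w[(f, pred)] -= v
                    prev = gold[t]

data = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])]
tagger = GreedySequenceTagger(tags=["DET", "NOUN", "VERB"])
tagger.train(data)
print(tagger.predict(["the", "dog", "barks"]))
```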

We will describe both approaches, with a deeper focus on the latter learning-to-search paradigm, which has less tutorial support. This paradigm has been gaining increasing traction over the past five years, making advances in natural language processing (dependency parsing, semantic parsing), robotics (grasping and path planning), social network analysis and computer vision (object segmentation).

Bayesian Time Series Modeling: Structured Representations for Scalability

Emily Fox (University of Washington).
Download slides.

Time series of increasing complexity are being collected in a variety of fields ranging from neuroscience, genomics, and environmental monitoring to e-commerce, enabled by technologies and infrastructures that were previously unavailable. These datasets can be viewed either as a single, high-dimensional time series or as a massive collection of time series with intricate and possibly evolving relationships between them. For scalability, it is crucial to discover and exploit sparse dependencies between the data streams or dimensions. Such representational structures have been extensively explored in the machine learning community for independent data sources. However, in the conversation on big data, despite the importance and prevalence of time series, the question of how to analyze such data at scale has received limited attention and remains an area of open research opportunities.

For these time series of interest, there are two key modeling components: the dynamic and relational models, and their interplay. In this tutorial, we will review some foundational time series models, including the hidden Markov model (HMM) and vector autoregressive (VAR) process.  Such dynamical models and their extensions have proven useful in capturing complex dynamics of individual data streams such as human motion, speech, EEG recordings, and genome sequences.  However, a focus of this tutorial will be on how to deploy scalable representational structures for capturing sparse dependencies between data streams. In particular, we consider clustering, directed and undirected graphical models, and low-dimensional embeddings in the context of time series.  An emphasis is on learning such structure from the data.  We will also provide some insights into new computational methods for performing efficient inference in large-scale time series.
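As a concrete (and entirely standard, not tutorial-specific) illustration of how structure enters such models, the sketch below simulates a vector autoregressive VAR(1) process x_t = A x_{t-1} + noise in which a sparse transition matrix encodes which data streams drive which, and then recovers the matrix by least squares; in practice a sparsity-inducing penalty would be used to learn the dependency graph.

```python
# A minimal VAR(1) sketch (assumes numpy; not code from the tutorial).
# A[i, j] = 0 means stream j does not directly drive stream i.

import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 200

# Sparse dependency structure: each stream depends on itself,
# and stream 0 additionally drives stream 1.
A = np.diag([0.9, 0.5, 0.7, 0.8])
A[1, 0] = 0.4

x = np.zeros((T, d))
for t in range(1, T):
    x[t] = A @ x[t - 1] + 0.1 * rng.standard_normal(d)

# Recover A by least squares, one regression per stream; a lasso-type
# penalty would be used in practice to learn the sparse graph.
A_hat, *_ = np.linalg.lstsq(x[:-1], x[1:], rcond=None)
print(np.round(A_hat.T, 2))
```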

Throughout the tutorial we will highlight Bayesian and Bayesian nonparametric approaches for learning and inference. Bayesian methods provide an attractive framework for examining complex data streams by naturally incorporating and propagating notions of uncertainty and enabling integration of heterogeneous data sources; the Bayesian nonparametric aspect allows the complexity of the dynamics and relational structure to adapt to the observed data.

Natural Language Understanding: Foundations and State-of-the-Art

Percy Liang (Stanford University).
Download slides.

Building systems that can understand human language—being able to answer questions, follow instructions, carry on dialogues—has been a long-standing challenge since the early days of AI. Due to recent advances in machine learning, there is renewed interest in taking on this formidable task. A major question is how one represents and learns the semantics (meaning) of natural language, to which there are only partial answers. The goal of this tutorial is (i) to describe the linguistic and statistical challenges that any system must address; and (ii) to describe the types of cutting-edge approaches and the remaining open problems. Topics include distributional semantics (e.g., word vectors), frame semantics (e.g., semantic role labeling), model-theoretic semantics (e.g., semantic parsing), the role of context, grounding, neural networks, latent variables, and inference. The hope is that this unified presentation will clarify the landscape, and show that this is an exciting time for the machine learning community to engage with the problems of natural language understanding.
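As a toy illustration of the distributional-semantics idea mentioned above (not material from the tutorial; the co-occurrence counts below are made up), words can be represented by context-count vectors and compared geometrically:

```python
# Hypothetical co-occurrence counts with context words [drink, eat, road, wheel];
# similarity of meaning is approximated by cosine similarity of the vectors.

import numpy as np

vectors = {
    "coffee": np.array([8.0, 2.0, 0.0, 0.0]),
    "tea":    np.array([7.0, 1.0, 0.0, 0.0]),
    "car":    np.array([0.0, 0.0, 6.0, 9.0]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print("coffee ~ tea:", round(cosine(vectors["coffee"], vectors["tea"]), 2))
print("coffee ~ car:", round(cosine(vectors["coffee"], vectors["car"]), 2))
```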

Policy Search: Methods and Applications

Gerhard Neumann (Technische Universität Darmstadt) and Jan Peters (Technische Universität Darmstadt & Max Planck Institute for Intelligent Systems, Tübingen).
Download slides.

Policy search is a subfield of reinforcement learning which focuses on finding good parameters for a given policy parametrization. It is well suited for robotics as it can cope with high-dimensional state and action spaces, one of the main challenges in robot learning. We review recent successes of both model-free and model-based policy search in robot learning.

Model-free policy search is a general approach to learn policies based on sampled trajectories. We classify model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy and present a unified view on existing algorithms. Learning a policy is often easier than learning an accurate forward model, and, hence, model-free methods are more frequently used in practice. However, for each sampled trajectory, it is necessary to interact with the robot, which can be time consuming and challenging in practice. Model-based policy search addresses this problem by first learning a simulator of the robot’s dynamics from data. Subsequently, the simulator generates trajectories that are used for policy learning. For both model-free and model-based policy search methods, we review their respective properties and their applicability to robotic systems.
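As a generic illustration of model-free policy search (a plain likelihood-ratio, REINFORCE-style gradient on a toy problem, not the specific algorithms surveyed in the tutorial; the environment and policy class are assumptions made for the sketch):

```python
# Model-free policy search sketch: sample trajectories under a parametrized
# Gaussian policy, then update the parameters with a Monte-Carlo policy
# gradient. Toy 1-D environment and step size are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def rollout(theta, T=20):
    """Sample one trajectory; return (sum of grad-log-prob, total reward)."""
    s, reward, grads = 0.0, 0.0, []
    for _ in range(T):
        mean = theta[0] * s + theta[1]                        # linear-Gaussian policy
        a = mean + 0.5 * rng.standard_normal()
        grads.append(np.array([s, 1.0]) * (a - mean) / 0.25)  # d log pi / d theta
        s = s + a + 0.05 * rng.standard_normal()              # unknown dynamics
        reward += -s**2 - 0.1 * a**2                          # keep the state near 0
    return np.sum(grads, axis=0), reward

theta = np.zeros(2)
for it in range(200):
    # Likelihood-ratio gradient: average (grad log-likelihood) * (return - baseline)
    samples = [rollout(theta) for _ in range(20)]
    baseline = np.mean([r for _, r in samples])
    grad = np.mean([g * (r - baseline) for g, r in samples], axis=0)
    theta += 1e-3 * grad
print("learned policy parameters:", theta)
```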

Modern Convex Optimization Methods for Large-scale Empirical Risk Minimization

Peter Richtárik (University of Edinburgh) and Mark Schmidt (University of British Columbia).
Download slides: part 1, part 2.

This tutorial reviews recent advances in convex optimization for training (linear) predictors via (regularized) empirical risk minimization. We focus exclusively on methods that are practically efficient and equipped with complexity bounds confirming their suitability for huge-dimensional problems (a very large number of examples or a very large number of features).

The first part of the tutorial is dedicated to modern primal methods (belonging to the stochastic gradient descent variety), while the second part focuses on modern dual methods (belonging to the randomized coordinate ascent variety). While we make this distinction, there are very close links between the primal and dual methods, some of which will be highlighted. We shall also comment on mini-batch, parallel, and distributed variants of the methods, as this is an important consideration for applications involving big data.
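For concreteness, here is a minimal sketch of a primal method of the stochastic-gradient variety applied to L2-regularized logistic-loss ERM (a standard textbook scheme with a 1/(λt) step size, not one of the specific accelerated or variance-reduced algorithms covered in the tutorial; the synthetic data are illustrative only):

```python
# SGD sketch for the primal ERM problem:
#   min_w (1/n) sum_i log(1 + exp(-y_i x_i^T w)) + (lambda/2) ||w||^2

import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 1000, 20, 1e-3

# Synthetic (roughly) linearly separable data.
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(n))

w = np.zeros(d)
for t in range(1, 10 * n + 1):
    i = rng.integers(n)                        # sample one example
    margin = y[i] * (X[i] @ w)
    grad = -y[i] * X[i] / (1 + np.exp(margin)) + lam * w
    w -= grad / (lam * t)                      # 1/(lambda t) step size

print("training error:", np.mean(np.sign(X @ w) != y))
```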

Computational Social Science

Hanna Wallach (Microsoft Research & University of Massachusetts Amherst).

From interactions between friends, colleagues, or political leaders to the activities of corporate or government organizations, complex social processes underlie almost all human endeavor. The emerging field of computational social science is concerned with the development of new mathematical models and computational tools for understanding and reasoning about such processes from noisy, missing, or uncertain information. Computational social science is an inherently interdisciplinary area, situated at the intersection of computer science, statistics, and the social sciences, with researchers from traditionally disparate backgrounds working together to answer questions arising in sociology, political science, economics, public policy, journalism, and beyond.

In the first half of this tutorial, I will provide an overview of computational social science, emphasizing recent research that moves beyond the study of small-scale, static snapshots of networks, and onto nuanced, data-driven analyses of the structure, content, and dynamics of large-scale social processes. I will focus on commonalities of these social processes, as well as differences between the types of modeling tasks typically prioritized by computer scientists and social scientists. I will then discuss Bayesian latent variable modeling as a methodological framework for understanding and reasoning about complex social processes, and provide a brief overview of Bayesian inference.

In the second half of this tutorial, I will concentrate specifically on political science. I will discuss data sources, acquisition methods, and research questions, as well as the mathematical details of several models recently developed by the political methodology community. These models, which draw upon research in machine learning and natural language processing, not only serve as examples of the outstanding methodological work being done in the social sciences, but also demonstrate how ideas originally developed by computer scientists can be adapted and used to answer substantive questions that further our understanding of society.