
The Third Workshop On Tractable Probabilistic Modeling (TPM)
Pedro Domingos · Daniel Lowd · Tahrima Rahman · Antonio Vergari · Alejandro Molina

Fri Jun 14 08:30 AM -- 06:00 PM (PDT) @ 202
Event URL: https://sites.google.com/view/icmltpm2019/home

Probabilistic modeling has become the de facto framework to reason about uncertainty in Machine Learning and AI. One of the main challenges in probabilistic modeling is the trade-off between the expressivity of the models and the complexity of performing various types of inference, as well as learning them from data.

This inherent trade-off is clearly visible in powerful -- but intractable -- models like Markov random fields, (restricted) Boltzmann machines, (hierarchical) Dirichlet processes and variational autoencoders. Despite these models’ recent successes, performing inference with them requires resorting to approximate routines. Moreover, learning such models from data is generally harder, since inference is a sub-routine of learning and thus requires simplifying assumptions or further approximations. Guarantees of tractability at inference and learning time are therefore highly desirable in many real-world scenarios.

Tractable probabilistic modeling (TPM) concerns methods guaranteeing exactly this: performing exact (or tractably approximate) inference and/or learning. To achieve this, the following approaches have been proposed: i) low or bounded-treewidth probabilistic graphical models and determinantal point processes, which exchange expressiveness for efficiency; ii) graphical models with high girth or weak potentials, which provide bounds on the performance of approximate inference methods; and iii) exchangeable probabilistic models, which exploit symmetries to reduce inference complexity. More recently, models compiling inference routines into efficient computational graphs, such as arithmetic circuits, sum-product networks, cutset networks and probabilistic sentential decision diagrams, have advanced state-of-the-art inference performance by exploiting context-specific independence, determinism, or latent variables. TPMs have been successfully used in numerous real-world applications: image classification, completion and generation, scene understanding, activity recognition, language and speech modeling, bioinformatics, collaborative filtering, and verification and diagnosis of physical systems.
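To make the appeal of such circuit-based models concrete, here is a minimal toy sum-product network over two binary variables (an illustrative sketch, not drawn from any of the works above): because the circuit is a sum (mixture) of products of univariate leaves, any marginal is computed in a single bottom-up pass by setting the leaves of marginalized variables to 1.

```python
def leaf(p):
    # Bernoulli leaf: likelihoods for x = 0 and x = 1.
    # Passing None marginalizes the variable out (leaf evaluates to 1).
    return {0: 1 - p, 1: p, None: 1.0}

def spn(x1, x2):
    # Sum node (mixture weights 0.6 / 0.4) over two product nodes,
    # each a product of independent univariate leaves.
    l1a, l1b = leaf(0.8), leaf(0.3)  # leaves for X1
    l2a, l2b = leaf(0.6), leaf(0.1)  # leaves for X2
    return 0.6 * l1a[x1] * l2a[x2] + 0.4 * l1b[x1] * l2b[x2]

p_joint = spn(1, 0)     # P(X1=1, X2=0): full joint evaluation
p_marg = spn(1, None)   # P(X1=1): exact marginal, same single pass
```

The same single evaluation answers joint, marginal, and conditional queries (as ratios of marginals), which is precisely the tractability that intractable models lack.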

The aim of this workshop is to bring together researchers working on the different fronts of tractable probabilistic modeling, highlighting recent trends and open challenges. At the same time, we want to foster discussion across similar or complementary sub-fields in the broader probabilistic modeling community. One such field is that of neural probabilistic models, such as normalizing flows and autoregressive models, which achieve impressive results in generative modeling; it is an interesting open challenge for the TPM community to keep a broad range of inference routines tractable while leveraging these models’ expressiveness. Furthermore, the rising field of probabilistic programming promises to be the new lingua franca of model-based learning. This offers the TPM community opportunities to push the expressiveness of the models used in general-purpose universal probabilistic languages, such as Pyro, while maintaining efficiency.

We want to promote discussion and advance the field both through high-quality contributed works and through high-profile invited speakers from the aforementioned tangent sub-fields of probabilistic modeling.

Fri 9:00 a.m. - 9:10 a.m.
Welcome (Talk)
Fri 9:10 a.m. - 9:50 a.m.

I will discuss Testing Arithmetic Circuits (TACs), which are new tractable probabilistic models that are universal function approximators like neural networks. A TAC represents a piecewise multilinear function and computes a marginal query on the newly introduced Testing Bayesian Network (TBN). The structure of a TAC is automatically compiled from a Bayesian network and its parameters are learned from labeled data using gradient descent. TACs can incorporate background knowledge that is encoded in the Bayesian network, whether conditional independence or domain constraints. Hence, the behavior of a TAC comes with some guarantees that are invariant to how it is trained from data. Moreover, a TAC is amenable to being interpretable since its nodes and parameters have precise meanings by virtue of being compiled from a Bayesian network. This recent work aims to fuse models (Bayesian networks) and functions (DNNs) with the goal of realizing their collective benefits.

Adnan Darwiche
Fri 9:50 a.m. - 10:30 a.m.
Poster spotlights (Spotlights)
Fri 10:30 a.m. - 11:00 a.m.
Coffee Break (Break)
Fri 11:00 a.m. - 11:40 a.m.

“An important component of human problem-solving expertise is the ability to use knowledge about solving easy problems to guide the solution of difficult ones.” (Minsky) A longstanding intuition in AI is that intelligent agents should be able to use solutions to easy problems to solve hard ones; this has often been termed the “tractable islands paradigm.” How do we act on this intuition in the domain of probabilistic reasoning? This talk will describe the status of probabilistic reasoning algorithms driven by the tractable islands paradigm when solving optimization, likelihood, and mixed (max-sum-product, e.g. marginal MAP) queries. I will show how heuristics generated via variational relaxation into tractable structures can guide heuristic search and Monte-Carlo sampling, yielding anytime solvers that produce approximations with confidence bounds that improve with time and become exact if enough time is allowed.

Rina Dechter
Fri 11:40 a.m. - 12:00 p.m.
Poster spotlights (Spotlights)
Fri 12:00 p.m. - 12:40 p.m.

Sum-product networks (SPNs) are a prominent class of tractable probabilistic models, facilitating efficient marginalization, conditioning, and other inference routines. However, despite these attractive properties, SPNs have received rather little attention in the (probabilistic) deep learning community, which focuses instead on intractable models such as generative adversarial networks, variational autoencoders, normalizing flows, and autoregressive density estimators. In this talk, I discuss several recent endeavors which demonstrate i) that SPNs can be effectively used as deep learning models, and ii) that hybrid learning approaches combining SPNs with other deep learning models are in fact sensible and beneficial.

Robert Peharz
Fri 12:40 p.m. - 2:20 p.m.
Lunch (Break)
Fri 2:20 p.m. - 3:00 p.m.

A wide class of machine learning algorithms can be reduced to variable elimination on factor graphs. While factor graphs provide a unifying notation for these algorithms, they do not provide a compact way to express repeated structure when compared to plate diagrams for directed graphical models. In this talk, I will describe a generalization of undirected factor graphs to plated factor graphs, and a corresponding generalization of the variable elimination algorithm that exploits efficient tensor algebra in graphs with plates of variables. This tensor variable elimination algorithm has been integrated into the Pyro probabilistic programming language, enabling scalable, automated exact inference in a wide variety of deep generative models with repeated discrete latent structure. I will discuss applications of such models to polyphonic music modeling, animal movement modeling, and unsupervised word-level sentiment analysis, as well as algorithmic applications to exact subcomputations in approximate inference and ongoing work on extensions to continuous latent variables.

Eli Bingham
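The core operation in the talk above, eliminating a discrete variable from a factor graph via tensor algebra, can be sketched with NumPy's einsum. This is a hypothetical two-factor chain for illustration, not the Pyro implementation itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy chain factor graph f(A, B) * g(B, C) with discrete variables:
# A has 2 states, B has 3, C has 4.
f = rng.random((2, 3))  # factor over (A, B)
g = rng.random((3, 4))  # factor over (B, C)

# Eliminate B: one tensor contraction yields the
# unnormalized marginal over (A, C).
m_ac = np.einsum('ab,bc->ac', f, g)

# Eliminate all variables: the partition function Z.
Z = np.einsum('ab,bc->', f, g)
```

Tensor variable elimination generalizes this pattern to factors carrying plate dimensions, so repeated structure is contracted in batch rather than unrolled into many individual factors.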
Fri 3:00 p.m. - 3:30 p.m.
Coffee Break (Break)
Fri 3:30 p.m. - 4:10 p.m.

In this talk, I will discuss how state-of-the-art discriminative deep networks can be turned into likelihood-based density models. Further, I will discuss how such models give rise to an alternative viewpoint on adversarial examples. Under this viewpoint, adversarial examples are a consequence of excessive invariances learned by the classifier, manifesting themselves in striking failures when evaluating the model on out-of-distribution inputs. I will discuss how the commonly used cross-entropy objective encourages such overly invariant representations. Finally, I will present an extension to cross-entropy that, by exploiting properties of invertible deep networks, enables control of erroneous invariances in theory and practice.

Jörn Jacobsen
Fri 4:10 p.m. - 6:30 p.m.
Poster session (Posters)

Author Information

Pedro Domingos (University of Washington)
Daniel Lowd (University of Oregon)
Tahrima Rahman (University of Texas at Dallas)
Antonio Vergari (Max-Planck Institute for Intelligent Systems)
Alejandro Molina (TU Darmstadt)
Antonio Vergari (University of California, Los Angeles)
