Timezone: »

Workshop
Graph Representation Learning and Beyond (GRL+)
Petar Veličković · Michael M. Bronstein · Andreea Deac · Will Hamilton · Jessica Hamrick · Milad Hashemi · Stefanie Jegelka · Jure Leskovec · Renjie Liao · Federico Monti · Yizhou Sun · Kevin Swersky · Rex (Zhitao) Ying · Marinka Zitnik

Thu Jul 16 11:40 PM -- 10:00 AM (PDT) @ None

Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of CNNs to graph-structured data, and neural message-passing approaches. These advances in graph neural networks and related techniques have led to new state-of-the-art results in numerous domains: chemical synthesis, 3D-vision, recommender systems, question answering, continuous control, self-driving and social network analysis. Building on the successes of three related workshops from last year (at ICML, ICLR and NeurIPS), the primary goal for this workshop is to facilitate community building, and support expansion of graph representation learning into more interdisciplinary projects with the natural and social sciences. With hundreds of new researchers beginning projects in this area, we hope to bring them together to consolidate this fast-growing area into a healthy and vibrant subfield. Especially, we aim to strongly promote novel and exciting applications of graph representation learning across the sciences, reflected in our choices of invited speakers.

 Thu 11:40 p.m. - 12:00 a.m. Opening Remarks (Introduction) Petar Veličković · Andreea-Ioana Deac 🔗 Fri 12:00 a.m. - 12:30 a.m. Invited Talk: Xavier Bresson (Talk) Xavier Bresson 🔗 Fri 12:30 a.m. - 1:00 a.m. Invited Talk: Thomas Kipf (Talk) Thomas Kipf 🔗 Fri 1:00 a.m. - 1:30 a.m. Q&A / Discussions / Coffee 1 (Discussion) 🔗 Fri 1:30 a.m. - 2:30 a.m. Virtual Poster Session #1 (Poster Session) 🔗 Fri 2:30 a.m. - 2:40 a.m. Novel Applications: Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks (Spotlight)    We present Wiki-CS, a novel dataset derived from Wikipedia for benchmarking Graph Neural Networks. The dataset consists of nodes corresponding to Computer Science articles, with edges based on hyperlinks and 10 classes representing different branches of the field. We use the dataset to evaluate semi-supervised node classification and single-relation link prediction models. Our experiments show that these methods perform well on a new domain, with structural properties different from earlier benchmarks. We will make the dataset publicly available for future work. Teaser video | [ protected link dropped ] Péter Mernyei 🔗 Fri 2:40 a.m. - 2:50 a.m. Novel Applications: Graph Neural Networks for Massive MIMO Detection (Spotlight)    In this paper, we innovately use graph neural networks (GNNs) to learn a message-passing solution for the inference task of massive multiple multiple-input multiple-output (MIMO) detection in wireless communication. We adopt a graphical model based on the Markov random field (MRF) where belief propagation (BP) yields poor results when it assumes a uniform prior over the transmitted symbols. Numerical simulations show that, under the uniform prior assumption, our GNN-based MIMO detection solution outperforms the minimum mean-squared error (MMSE) baseline detector, in contrast to BP. Furthermore, experiments demonstrate that the performance of the algorithm slightly improves by incorporating MMSE information into the prior. Teaser video | [ protected link dropped ] Andrea Scotti 🔗 Fri 2:50 a.m. - 3:00 a.m. Novel Applications: Embedding a random graph via GNN: Extended mean-field inference theory and RL applications to NP-Hard multi-robot/machine scheduling (Spotlight)    We illustrate that developing a theory of ‘how to embed a random graph using GNN’ is the key to achieving the first near-optimal learning-based scheduling algorithm for an NP-hard multi-robot scheduling problem for tasks with time-varying rewards. We focus on a problem referred to as a Multi-Robot Reward Collection (MRRC) problem, of which immediate applications are ridesharing and pickup-and-delivery problems. We 1) observe that states in our robot scheduling problems can be represented as an extension of probabilistic graphical models (PGMs), which we refer to as random PGMs, and 2) develop a meanfield inference method for random PGMs. We then prove that a simple heuristic for applying deep graph encoder for random graph embedding is theoretically justified. We illustrate how a two-step hierarchical inference induces precise Qfunction estimation. We empirically demonstrate that our method achieves near-optimality (plus transferability and scalability, machine scheduling (IPMS) applications in the appendix section). Arxiv preprint: https://arxiv.org/abs/1905.12204. Teaser video | [ protected link dropped ] Hyunwook Kang 🔗 Fri 3:00 a.m. - 3:10 a.m. Update: PyTorch Geometric (Demonstration) Matthias Fey 🔗 Fri 3:10 a.m. - 3:20 a.m. Update: Deep Graph Library (Demonstration) George Karypis 🔗 Fri 3:20 a.m. - 3:30 a.m. Update: Open Graph Benchmark (Demonstration) Jure Leskovec 🔗 Fri 3:30 a.m. - 4:30 a.m. Lunch Break (Break) 🔗 Fri 4:30 a.m. - 4:45 a.m. Original Research: Get Rid of Suspended Animation: Deep Diffusive Neural Network for Graph Representation Learning (Spotlight)    Existing graph neural networks may suffer from the “suspended animation problem” when the model architecture goes deep. Meanwhile, for some graph learning scenarios, e.g., nodes with text/image attributes or graphs with long- distance node correlations, deep graph neural networks will be necessary for effective graph representation learning. In this paper, we propose a new graph neural network, namely DIFNET (Graph Diffusive Neural Network), for deep graph representation learning and node classification. DIFNET utilizes both neural gates and graph residual learning for node hidden state modeling, and includes an attention mechanism for node neighborhood information diffusion. Extensive experimental results can illustrate both the learning performance advantages of DIFNET compared with existing methods, especially in addressing the “suspended animation problem”. Teaser video | [ protected link dropped ] Jiawei Zhang 🔗 Fri 4:45 a.m. - 5:00 a.m. Original Research: Learning Graph Models for Template-Free Retrosynthesis (Spotlight)    Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule. Despite advancements in neural retrosynthesis algorithms, they are unable to fully recapitulate the strategies employed by chemists and do not generalize well to infrequent reactions. In this paper, we propose a graph-based approach that capitalizes on the idea that graph topology of precursor molecules is largely unaltered during the reaction. The model first predicts the set of graph edits transforming the target into incomplete molecules called synthons. Next, the model learns to expand synthons into complete molecules by attaching relevant leaving groups. Since the model operates at the level of molecular fragments, it avoids full generation, greatly simplifying the underlying architecture and improving its ability to generalize. The model yields 11.7% absolute improvement over state-of-the-art approaches on the USPTO-50k dataset. Teaser video | [ protected link dropped ] Vignesh Ram Somnath 🔗 Fri 5:00 a.m. - 5:15 a.m. Original Research: Frequent Subgraph Mining by Walking in Order Embedding Space (Spotlight)    Identifying frequent subgraphs or network motifs has been crucial in analyzing and predicting properties of real-world networks. However, finding large commonly-occurring motifs remains an open problem due to its NP-hard subroutine of subgraph counting, and the combinatorial growth of the number of possible subgraphs with their size. Here we present Subgraph Pattern Miner (SPMiner), a novel neural approach to finding frequent subgraphs in a large target graph. SPMiner integrates graph neural networks, order embedding space, and an efficient search strategy to identify network subgraph patterns that appear most frequently in a target graph dataset. SPMiner first decomposes the target graph into many overlapping subgraphs and then encodes the subgraphs into order embeddings. SPMiner then uses a monotonic walk in the order embedding space to identify frequent motifs. Compared to existing approaches and possible neural alternatives, SPMiner is more accurate, faster, and more scalable. For 5- and 6-node motifs, we show that SPMiner can identify almost all of the most frequent motifs while being 100x faster than exact enumeration methods. In addition, SPMiner can also reliably identify frequent 10-node motifs, which is well beyond the size limit of exact enumeration approaches. And last, We show that SPMiner can find large 10+ node motifs that appear 10-100x more frequently than those found by current widely-used approximate methods. Teaser video | [ protected link dropped ] Rex (Zhitao) Ying 🔗 Fri 5:15 a.m. - 5:45 a.m. Invited Talk: Kyle Cranmer (Talk) Kyle Cranmer 🔗 Fri 5:45 a.m. - 6:15 a.m. Invited Talk: Danai Koutra (Talk) Danai Koutra 🔗 Fri 6:15 a.m. - 6:45 a.m. Q&A / Discussions / Coffee 2 (Discussion) 🔗 Fri 6:45 a.m. - 6:55 a.m. COVID-19 Applications: Navigating the Dynamics of Financial Embeddings over Time (Spotlight)    Financial transactions constitute connections between entities and through these connections a large scale heterogeneous weighted graph is formulated. In this labyrinth of interactions that are continuously updated, there exists a variety of similarity-based patterns that can provide insights into the dynamics of the financial system. With the current work, we propose the application of Graph Representation Learning in a scalable dynamic setting as a means of capturing these patterns in a meaningful and robust way. We proceed to perform a rigorous qualitative analysis of the latent trajectories to extract real world insights from the proposed representations and their evolution over time that is to our knowledge the first of its kind in the financial sector. Shifts in the latent space are associated with known economic events and in particular the impact of the recent Covid-19 pandemic to consumer patterns. Capturing such patterns indicates the value added to financial modeling through the incorporation of latent graph representations. Teaser video | [ protected link dropped ] Antonia Gogoglou 🔗 Fri 6:55 a.m. - 7:05 a.m. COVID-19 Applications: Integrating Logical Rules Into Neural Multi-Hop Reasoning for Drug Repurposing (Spotlight)    The graph structure of biomedical data differs from those in typical knowledge graph benchmarking tasks. A particular property of biomedical data is the presence of long-range dependencies, which can be captured by patterns described as logical rules. We propose a novel method that combines these rules with a neural multi-hop reasoning approach that uses reinforcement learning. We conduct an empirical study based on the real-world task of drug repurposing by formulating this task as a link prediction problem. We apply our method to the biomedical knowledge graph Hetionet and show that our approach outperforms several baseline methods. Teaser video | [ protected link dropped ] Yushan Liu 🔗 Fri 7:05 a.m. - 7:15 a.m. COVID-19 Applications: Gaining insight into SARS-CoV-2 infection and COVID-19 severity using self-supervised edge features and Graph Neural Networks (Spotlight)    Graph Neural Networks (GNN) have been extensively used to extract meaningful representations from graph structured data and to perform predictive tasks such as node classification and link prediction. In recent years, there has been a lot of work incorporating edge features along with node features for prediction tasks. In this work, we present a framework for creating new edge features, via a combination of self-supervised and unsupervised learning which we then use along with node features for node classification tasks. We validate our work on two biological datasets comprising of single-cell RNA sequencing data of in vitro SARS-CoV-2 infection and human COVID-19 patients. We demonstrate that our method achieves better performance over baseline Graph Attention Network (GAT) and Graph Convolutional Network (GCN) models. Furthermore, given the attention mechanism on edge and node features, we are able to interpret the cell types and genes that determine the course and severity of COVID-19, contributing to a growing list of potential disease biomarkers and therapeutic targets. Teaser video | [ protected link dropped ] Arijit Sehanobish 🔗 Fri 7:15 a.m. - 7:45 a.m. Invited Talk: Tina Eliassi-Rad (Talk) Tina Eliassi-Rad 🔗 Fri 7:45 a.m. - 8:15 a.m. Invited Talk: Raquel Urtasun (Talk) Raquel Urtasun 🔗 Fri 8:15 a.m. - 8:45 a.m. Q&A / Discussions / Coffee 3 (Discussion) 🔗 Fri 8:45 a.m. - 9:45 a.m. Virtual Poster Session #2 (Poster Session) 🔗 Fri 9:45 a.m. - 10:00 a.m. Closing Remarks (Conclusions) Petar Veličković 🔗 - (#2 / Sess. 1) When Spectral Domain Meets Spatial Domain in Graph Neural Networks (Poster Teaser)    Convolutional Graph Neural Networks (ConvGNNs) are designed either in the spectral domain or in the spatial domain. In this paper, we provide a theoretical framework to analyze these neural networks, by deriving some equivalence of the graph convolution processes, regardless if they are designed in the spatial or the spectral domain. We demonstrate the relevance of the proposed framework by providing a spectral analysis of the most popular ConvGNNs (ChebNet, CayleyNet, GCN and Graph Attention Networks), which allows to explain their performance and shows their limits. Teaser video | [ protected link dropped ] Muhammet Balcilar 🔗 - (#3 / Sess. 2) Spectral-designed Depthwise Separable Graph Neural Networks (Poster Teaser)    This paper aims at revisiting Convolutional Graph Neural Networks (ConvGNNs) by designing new graph convolutions in spectral domain with a custom frequency profile while applying them in the spatial domain. Within the proposed framework, we propose two ConvGNNs methods: one using a simple single-convolution kernel that operates as a low-pass filter, and one operating multiple convolution kernels called Depthwise Separable Graph Convolution Network (DSGCN). The latter is a generalization of the depthwise separable convolution framework for graph convolutional networks, which allows to decrease the total number of trainable parameters while keeping the capacity of the model unchanged. Our proposals are evaluated on both transductive and inductive graph learning problems, demonstrating that DSGCN outperforms the state-of-the-art methods. Teaser video | [ protected link dropped ] Muhammet Balcilar 🔗 - (#8 / Sess. 2) Practical Adversarial Attacks on Graph Neural Networks (Poster Teaser)    We study the black-box attacks on graph neural networks (GNNs) under a novel and realistic constraint: attackers have access to only a subset of nodes in the network, and they can only attack a small number of them. A node selection step is essential under this setup. We demonstrate that the structural inductive biases of GNN models can be an effective source for this type of attacks. Specifically, by exploiting the connection between the backward propagation of GNNs and random walks, we show that the common gradient-based white-box attacks can be generalized to the black-box setting via the connection between the gradient and an importance score similar to PageRank. In practice, we find attacks based on this importance score indeed increase the classification loss by a large margin, but they fail to significantly increase the mis-classification rate. Our further analyses suggest that there is a discrepancy between the loss and mis-classification rate, as the latter presents a diminishing-return pattern when the number of attacked nodes increases. Therefore, we propose a greedy procedure to correct the importance score that takes into account of the diminishing-return pattern. Experimental results show that the proposed procedure can significantly increase the mis-classification rate of common GNNs on real-world data without access to model parameters nor predictions. Teaser video | [ protected link dropped ] Shuangrui Ding 🔗 - (#9 / Sess. 1) Graph Neural Networks in TensorFlow and Keras with Spektral (Poster Teaser)    In this paper we present Spektral, an open-source Python library for building graph neural networks with TensorFlow and the Keras application programming interface. Spektral implements a large set of methods for deep learning on graphs, including message-passing and pooling operators, as well as utilities for processing graphs and loading popular benchmark datasets. The purpose of this library is to provide the essential building blocks for creating graph neural networks, focusing on the guiding principles of user-friendliness and quick prototyping on which Keras is based. Spektral is, therefore, suitable for absolute beginners and expert deep learning practitioners alike. In this work, we present an overview of Spektral's features and report the performance of the methods implemented by the library in scenarios of node classification, graph classification, and graph regression. Teaser video | [ protected link dropped ] Daniele Grattarola 🔗 - (#10 / Sess. 2) Multi-Graph Neural Operator for Parametric Partial Differential Equations (Poster Teaser)    Graph neural networks (GNNs) have gained popularity in simulating physical systems and solving partial differential equations (PDEs) since graphs offer a natural way of modeling particle interactions and discretizing the continuum models. However, the graphs constructed for approximating such tasks usually ignore long-range interactions due to unfavorable scaling of the computational complexity with respect to the number of nodes. The errors due to these approximations scale with the discretization of the system, thereby not allowing for generalization under mesh-refinement. Inspired by the classical multipole methods, we propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity. Our multi-level formulation is equivalent to recursively adding inducing points to the kernel matrix, unifying GNNs with multi-resolution matrix factorization of the kernel. Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time. Teaser video | [ protected link dropped ] Zongyi Li 🔗 - (#12 / Sess. 1) Deep Graph Contrastive Representation Learning (Poster Teaser)    Graph representation learning nowadays becomes fundamental in analyzing graph-structured data. Inspired by recent success of contrastive methods, in this paper, we propose a novel framework for unsupervised graph representation learning by leveraging a contrastive objective at the node level. Specifically, we generate two graph views by corruption and learn node representations by maximizing the agreement of node representations in these two views. To provide diverse node contexts for the contrastive objective, we propose a hybrid scheme for generating graph views on both structure and attribute levels. We perform empirical experiments on both transductive and inductive learning tasks using a variety of real-world datasets. Experimental experiments demonstrate that despite its simplicity, our proposed method consistently outperforms existing state-of-the-art methods by large margins. Notably, our method gains about 10% absolute improvements on protein function prediction. Our unsupervised method even surpasses its supervised counterparts on transductive tasks. Teaser video | [ protected link dropped ] Yanqiao Zhu 🔗 - (#15 / Sess. 2) Learning Distributed Representations of Graphs with Geo2DR (Poster Teaser)    We present Geo2DR (Geometric to Distributed Representations), a Python library for unsupervised learning on graph-structured data using discrete substructure patterns and neural language models. It contains efficient implementations of popular graph decomposition algorithms and neural language models in PyTorch which are combined to learn representations using the distributive hypothesis. Furthermore, Geo2DR comes with general data processing and loading methods which can bring substantial speed-up in the training of the neural language models. Through this we provide a unified set of tools and methods to quickly construct systems capable of learning distributed representations of graphs. This is useful for replication of existing methods, modification, or development of novel systems. This paper serves to present the Geo2DR library and perform a comprehensive comparative analysis of existing methods re-implemented using Geo2DR across several widely used graph classification benchmarks. Geo2DR displays a high reproducibility of results in published methods and interoperability with other libraries useful for distributive language modelling. Teaser video | [ protected link dropped ] Paul Scherer 🔗 - (#18 / Sess. 1) Hierarchical Protein Function Prediction with Tail-GNNs (Poster Teaser)    Protein function prediction may be framed as predicting subgraphs (with certain closure properties) of a directed acyclic graph describing the hierarchy of protein functions. Graph neural networks (GNNs), with their built-in inductive bias for relational data, are hence naturally suited for this task. However, in contrast with most GNN applications, the graph is not related to the input, but to the label space. Accordingly, we propose Tail-GNNs, neural networks which naturally compose with the output space of any neural network for multi-task prediction, to provide relationally-reinforced labels. For protein function prediction, we combine a Tail-GNN with a dilated convolutional network which learns representations of the protein sequence, making significant improvement in F_1 score and demonstrating the ability of Tail-GNNs to learn useful representations of labels and exploit them in real-world problem solving. Teaser video | [ protected link dropped ] Stefan Spalević 🔗 - (#19 / Sess. 1) Neural Bipartite Matching (Poster Teaser)    Graph neural networks (GNNs) have found application for learning in the space of algorithms. However, the algorithms chosen by existing research (sorting, Breadth-First search, shortest path finding, etc.) usually align perfectly with a standard GNN architecture. This report describes how neural execution is applied to a complex algorithm, such as finding maximum bipartite matching by reducing it to a flow problem and using Ford-Fulkerson to find the maximum flow. This is achieved via neural execution based only on features generated from a single GNN. The evaluation shows strongly generalising results with the network achieving optimal matching almost 100% of the time. Teaser video | [ protected link dropped ] Dobrik Georgiev 🔗 - (#20 / Sess. 1) Principal Neighbourhood Aggregation for Graph Nets (Poster Teaser)    Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. We extend this theoretical framework to include continuous features---which occur regularly in real-world input domains and within the hidden layers of GNNs---and we demonstrate the requirement for multiple aggregation functions in this context. Accordingly, we propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). Finally, we compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of our model. Teaser video | [ protected link dropped ] Gabriele Corso 🔗 - (#21 / Sess. 2) Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions (Poster Teaser)    We present a systematic investigation using graph neural networks (GNNs) to model organic chemical reactions. To do so, we prepared a dataset collection of four ubiquitous reactions from the organic chemistry literature. We evaluate seven different GNN architectures for classification tasks pertaining to the identification of experimental reagents and conditions. We find that models are able to identify specific graph features that affect reaction conditions and lead to accurate predictions. The results herein show great promise in advancing molecular machine learning. Teaser video | [ protected link dropped ] Michael Maser 🔗 - (#23 / Sess. 2) A Note on Over-Smoothing for Graph Neural Networks (Poster Teaser)    Graph Neural Networks (GNNs) have achieved a lot of success on graph-structured data. However, it is observed that the performance of graph neural networks does not improve as the number of layers increases. This effect, known as over-smoothing, has been analyzed mostly in linear cases. In this paper, we build upon previous results \cite{oono2019graph} to further analyze the over-smoothing effect in the general graph neural network architecture. We show when the weight matrix satisfies the conditions determined by the spectrum of augmented normalized Laplacian, the Dirichlet energy of embeddings will converge to zero, resulting in the loss of discriminative power. Using Dirichlet energy to measure expressiveness" of embedding is conceptually clean; it leads to simpler proofs than \cite{oono2019graph} and can handle more non-linearities. Teaser video | [ protected link dropped ] Chen Cai 🔗 - (#24 / Sess. 2) Degree-Quant: Quantization-Aware Training for Graph Neural Networks (Poster Teaser)    Graph neural networks have demonstrated strong performance modelling non-uniform structured data. However, there exists little research exploring methods to make them more efficient at inference time. In this work, we explore the viability of training quantized GNNs models, enabling the usage of low precision integer arithmetic for inference. We propose a method, Degree-Quant, to improve performance over existing quantization-aware training baselines commonly used on other architectures, such as CNNs. Our work demonstrates that it is possible to train models using 8-bit integer arithmetic at inference-time with similar accuracy to their full precision counterparts. Teaser video | [ protected link dropped ] Shyam Tailor 🔗 - (#26 / Sess. 2) Message Passing Query Embedding (Poster Teaser)    Recent works on representation learning for Knowledge Graphs have moved beyond the problem of link prediction, to answering queries of an arbitrary structure. Existing methods are based on ad-hoc mechanisms that require training with a diverse set of query structures. We propose a more general architecture that employs a graph neural network to encode a graph representation of the query, where nodes correspond to entities and variables. The generality of our method allows it to encode a more diverse set of query types in comparison to previous work. Our method shows competitive performance against previous models for complex queries, and in contrast with these models, it can answer complex queries when trained for link prediction only. We show that the model learns entity embeddings that capture the notion of entity type without explicit supervision. Teaser video | [ protected link dropped ] Daniel Daza 🔗 - (#27 / Sess. 2) Learning Graph Structure with A Finite-State Automaton Layer (Poster Teaser)    Graph-based neural network models are producing strong results in a number of domains, in part because graphs provide flexibility to encode domain knowledge in the form of relational structure (edges). In practice, edges may represent intrinsic structure (e.g., abstract syntax trees of programs) or more abstract relations that aid reasoning (e.g., results of program analyses). In this work, we consider learning to derive abstract relations from intrinsic structure. Motivated by their power in program analyses, we consider relations defined by paths accepted by a finite-state automaton. We show how to learn these relations by transforming the problem into learning a policy on a graph-based POMDP with implicit differentiation. The result is a differentiable Graph Finite-State Automaton (GFSA) layer that adds a new edge type (expressed as a weighted adjacency matrix) to a base graph. We demonstrate that this layer performs well when either directly supervised or trained end-to-end for a downstream task. Teaser video | [ protected link dropped ] Daniel D Johnson 🔗 - (#28 / Sess. 1) Contrastive Graph Neural Network Explanation (Poster Teaser)    Graph Neural Networks achieve remarkable results on problems with structured data but come as black-box predictors. Transferring existing techniques, for example occlusion, to interpret models fails as even removing a single node or edge can lead to drastic changes in the graph. The resulting graphs can differ from all training examples, causing model confusion and wrong explanations. Thus, we argue that explicability must use graphs consistent with the distribution underlying the training data. We coin this property Distribution Compliant Explanation (DCE) and present a novel Contrastive GNN Explanation (CoGE) technique following this paradigm. An experimental study supports the efficacy of CoGE. Teaser video | [ protected link dropped ] Lukas Faber 🔗 - (#29 / Sess. 2) Few-shot link prediction via graph neural networks for Covid-19 drug-repurposing (Poster Teaser)    Predicting interactions among heterogenous graph structured data has numerous applications such as knowledge graph completion, recommendation systems and drug discovery. Often times, the links to be predicted belong to rare types such as the case in repurposing drugs for novel diseases. This motivates the task of few-shot link prediction. Typically, GCNs are ill-equipped in learning such rare link types since the relation embedding is not learned in an inductive fashion. This paper proposes an inductive RGCN for learning informative relation embeddings even in the few-shot learning regime. The proposed inductive model significantly outperforms the RGCN and state-of-the-art KGE models in few-shot learning tasks. Furthermore, we apply our method on the drug-repurposing knowledge graph (DRKG) for discovering drugs for Covid-19. We pose the drug discovery task as link prediction and learn embeddings for the biological entities that partake in the DRKG. Our initial results corroborate that several drugs used in clinical trials were identified as possible drug candidates. Teaser video | [ protected link dropped ] Da Zheng 🔗 - (#31 / Sess. 2) Generalized Multi-Relational Graph Convolution Network (Poster Teaser) Graph Convolutional Networks (GCNs) have received increasing attention in recent machine learning. How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focusing on representation learning for nodes only instead of jointly optimizing the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the GEneralized Multi-relational Graph Convolutional Networks (GEM-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge-base embedding methods, and goes beyond. Our theoretical analysis shows that GEM-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of GEM-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification. [ protected link dropped ] Donghan Yu 🔗 - (#101 / Sess. 1) Graph neural induction of value iteration (Poster Teaser)    Many reinforcement learning tasks can benefit from explicit planning based on an internal model of the environment. Previously, such planning components have been incorporated through a neural network that partially aligns with the computational graph of value iteration. Such network have so far been focused on restrictive environments (e.g. grid-worlds), and modelled the planning procedure only indirectly. We relax these constraints, proposing a graph neural network (GNN) that executes the value iteration (VI) algorithm, across arbitrary environment models, with direct supervision on the intermediate steps of VI. The results indicate that GNNs are able to model value iteration accurately, recovering favourable metrics and policies across a variety of out-of-distribution tests. This suggests that GNN executors with strong supervision are a viable component within deep reinforcement learning systems. Teaser video | [ protected link dropped ] Andreea-Ioana Deac 🔗 - (#34 / Sess. 1) Set2Graph: Learning Graphs From Sets (Poster Teaser)    Many problems in machine learning (ML) can be cast as learning functions from sets to graphs, or more generally to hypergraphs; in short, Set2Graph functions. Examples include clustering, learning vertex and edge features on graphs, and learning features on triplets in a collection. A natural approach for building Set2Graph models is to characterize all linear equivariant set-to-hypergraph layers and stack them with non-linear activations. This posses two challenges: (i) the expressive power of these networks is not well understood; and (ii) these models would suffer from high, often intractable computational and memory complexity, as their dimension grows exponentially. This paper advocates a family of neural network models for learning Set2Graph functions that is both practical and of maximal expressive power (universal), that is, can approximate arbitrary continuous Set2Graph functions over compact sets. Testing these models on different machine learning tasks, mainly an application to particle physics, we find them favorable to existing baselines. Teaser video | [ protected link dropped ] Hadar Serviansky 🔗 - (#36 / Sess. 1) A Graph VAE and Graph Transformer Approach to Generating Molecular Graphs (Poster Teaser)    We propose a combination of a variational autoencoder and a transformer based model which fully utilises graph convolutional and graph pooling layers to operate directly on graphs. The transformer model implements a novel node encoding layer, replacing the position encoding typically used in transformers, to create a transformer with no position information that operates on graphs, encoding adjacent node properties into the edge generation process. The proposed model builds on graph generative work operating on graphs with edge features, creating a model that offers improved scalability with the number of nodes in a graph. In addition, our model is capable of learning a disentangled, interpretable latent space that represents graph properties through a mapping between latent variables and graph properties. In experiments we chose a benchmark task of molecular generation, given the importance of both generated node and edge features. Using the QM9 dataset we demonstrate that our model performs strongly across the task of generating valid, unique and novel molecules. Finally, we demonstrate that the model is interpretable by generating molecules controlled by molecular properties, and we then analyse and visualise the learned latent representation. Teaser video | [ protected link dropped ] Joshua Mitton 🔗 - (#99 / Sess. 2) GraphNets with Spectral Message Passing (Poster Teaser)    Graph Neural Networks (GNNs) are the subject of intense focus by the machine learning community for problems involving relational reasoning. GNNs can be broadly divided into spatial and spectral approaches. Spatial approaches use a form of learned message-passing, in which interactions among vertices are computed locally, and information propagates over longer distances on the graph with greater numbers of message-passing steps. Spectral approaches use eigendecompositions of the graph Laplacian to produce a generalization of spatial convolutions to graph structured data which access information over short and long time scales simultaneously. Here we introduce a Spectral Graph Network, which applies message passing to both the spatial and spectral domains. Our model projects vertices of the spatial graph onto the Laplacian eigenvectors, which are each represented as vertices in a fully connected spectral graph'', and then applies learned message passing to them. We apply this model to various benchmark tasks including a sparse graph-version of MNIST image classification, molecular classification (MoleculeNet), and molecular property prediction (QM9). The Spectral GN promotes efficient training, reaching high performance with fewer training iterations despite having more parameters. The model also provides robustness to edge dropout and outperforms baselines for the classification tasks. Teaser video | [ protected link dropped ] Kimberly Stachenfeld 🔗 - (#37 / Sess. 1) Geometric Matrix Completion: A Functional View (Poster Teaser) We propose a totally functional view of geometric matrix completion problem. Differently from existing work, we propose a novel regularization inspired from the functional map literature that is more interpretable and theoretically sound. On synthetic tasks with strong underlying geometric structure, our framework outperforms state of the art by a huge margin (two order of magnitude) demonstrating the potential of our approach. On real datasets, we achieve state-of-the-art results at a fraction of the computational effort of previous methods. Teaser video | [ protected link dropped ] Abhishek Sharma 🔗 - (#97 / Sess. 1) Clustered Dynamic Graph CNN for Biometric 3D Hand Shape Recognition (Poster Teaser)    The research in biometric recognition using hand shape has been somewhat stagnating in the last decade. Meanwhile, computer vision and machine learning have experienced a paradigm shift with the renaissance of deep learning, which has set the new state-of-the-art in many related fields. Inspired by successful applications of deep learning for other biometric modalities, we propose a novel approach to 3D hand shape recognition from RGB-D data based on geometric deep learning techniques. We show how to train our model on synthetic data and retain the performance on real samples during test time. To evaluate our method, we provide a new dataset NNHand RGB-D of short video sequences and show encouraging performance compared to diverse baselines on the new data, as well as current benchmark dataset HKPolyU. Moreover, the new dataset opens door to many new research directions in hand shape recognition. Teaser video | [ protected link dropped ] Jan Svoboda 🔗 - (#39 / Sess. 1) Hierarchically Attentive Graph Pooling with Subgraph Attention (Poster Teaser)    Graph neural networks get significant attention for graph representation and classification in machine learning community. Different types of neighborhood aggregation and pooling strategies have been proposed in the literature. In this work, we introduce a higher order hierarchical GNN algorithm (SubGattPool) by employing (i) an attention mechanism which learns the importance and aggregates neighboring subgraphs of a node instead of first-order neighbors, and (ii) a hierarchical pooling strategy which learns the importance of different hierarchies in a GNN. SubGattPool is able to achieve state-of-the-art graph classification performance on multiple real-world datasets. Teaser video | [ protected link dropped ] Sambaran Bandyopadhyay 🔗 - (#96 / Sess. 1) Active Learning on Graphs via Meta Learning (Poster Teaser)    Active learning (AL) for semi-supervised node classification aims to reduce the number of labeled instances by selecting only the most informative nodes for labeling. The AL algorithms designed for other data types such as images and text do not perform well on graph-structured data. Although a few heuristics-based AL algorithms have been proposed for graphs, a principled approach is lacking. We propose MetAL, an AL algorithm that selects unlabeled items that directly improve the future performance of a graph neural network (GNN) model. We formulate the AL problem as a bilevel optimization problem. Based on recent work in meta-learning, we compute the meta-gradients to approximate the impact of unlabeled instances on the model uncertainty. We empirically demonstrate that MetAL outperforms existing AL algorithms. Teaser video | [ protected link dropped ] Kaushalya Madhawa 🔗 - (#40 / Sess. 2) HNHN: Hypergraph Networks with Hyperedge Neurons (Poster Teaser)    Hypergraphs provide a natural representation for many real world datasets. We propose a novel framework, HNHN, for hypergraph representation learning. HNHN is a hypergraph convolution network with nonlinear activation functions applied to both hypernodes and hyperedges, combined with a normalization scheme that can flexibly adjust the importance of high-cardinality hyperedges and high-degree vertices depending on the dataset. We demonstrate improved performance of HNHN in both classification accuracy and speed on real world datasets when compared to state of the art methods. Teaser video | [ protected link dropped ] Yihe Dong 🔗 - (#95 / Sess. 2) Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Protein Structures (Poster Teaser)    Graphein is a python library for constructinggraph and surface-mesh representations ofprotein structures for computational analysis. The library interfaces with popular geometric deep learning libraries: DGL, PyTorch Geometric and PyTorch3D. Geometric deep learning is emerging as a popular methodology in computational structural biology. As feature engineering is a vital step in a machine learning project, the library is designed to be highly flexible, allowing the user to parameterise the graph construction, scalable to facilitate working with large protein complexes, and containing useful pre-processing tools for preparing experimental structure files. Graphein is also designed to facilitate network-based and graph-theoretic analyses of protein structures in a high-throughput manner. As example workflows, we make available two new protein structure-related datasets, previously unused by the geometric deep learning community. Teaser video | [ protected link dropped ] Arian Jamasb 🔗 - (#43 / Sess. 2) Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings (Poster Teaser)    Biomolecular graph analysis has recently gained much attention in the emerging field of geometric deep learning. Here we focus on organizing biomolecular graphs in ways that expose meaningful relations and variations between them. We propose a geometric scattering autoencoder (GSAE) network for learning such graph embeddings. Our embedding network first extracts rich graph features using the recently proposed geometric scattering transform. Then, it leverages a semi-supervised variational autoencoder to extract a low-dimensional embedding that retains the information in these features that enable prediction of molecular properties as well as characterize graphs. We show that GSAE organizes RNA graphs both by structure and energy, accurately reflecting bistable RNA structures. Also, the model is generative and can sample new folding trajectories. Teaser video | [ protected link dropped ] Egbert Castro 🔗 - (#94 / Sess. 2) Are Hyperbolic Representations in Graphs Created Equal? (Poster Teaser)    Recently there was an increasing interest in applications of graph neural networks in non-Euclidean geometry; however, are non-Euclidean representations always useful for graph learning tasks? For different problems such as node classification and link prediction we compute hyperbolic embeddings and conclude that for tasks that require global prediction consistency it might be useful to use non-Euclidean embeddings, while for other tasks Euclidean models are superior. To do so we first fix an issue of the existing models associated with the optimization process at zero curvature. Current hyperbolic models deal with gradients at the origin in ad-hoc manner, which is inefficient and can lead to numerical instabilities. We solve the instabilities of kappa-Stereographic model at zero curvature cases and evaluate the approach of embedding graphs into the manifold in several graph representation learning tasks. Teaser video | [ protected link dropped ] Maxim Kochurov 🔗 - (#45 / Sess. 1) Hierarchical Inter-Message Passing for Learning on Molecular Graphs (Poster Teaser)    We present a hierarchical neural message passing architecture for learning on molecular graphs. Our model takes in two complementary graph representations: the raw molecular graph representation and its associated junction tree, where nodes represent meaningful clusters in the original graph, e.g., rings or bridged compounds. We then proceed to learn a molecule's representation by passing messages inside each graph, and exchange messages between the two representations using a coarse-to-fine and fine-to-coarse information flow. Our method is able to overcome some of the restrictions known from classical GNNs, like detecting cycles, while still being very efficient to train. We validate its performance on the ZINC dataset and datasets stemming from the MoleculeNet benchmark collection. Teaser video | [ protected link dropped ] Matthias Fey 🔗 - (#93 / Sess. 1) Geoopt: Riemannian Optimization in PyTorch (Poster Teaser)    Geoopt is a research-oriented modular open-source package for Riemannian Optimization in PyTorch. The core of Geoopt is a standard Manifold interface that allows for the generic implementation of optimization algorithms. Geoopt supports basic Riemannian SGD as well as adaptive optimization algorithms. Geoopt also provides several algorithms and arithmetic methods for supported manifolds, which allow composing geometry-aware neural network layers that can be integrated with existing models. Teaser video | [ protected link dropped ] Maxim Kochurov 🔗 - (#46 / Sess. 2) Discrete Planning with End-to-end Trained Neuro-algorithmic Policies (Poster Teaser)    Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, most have been shown to be lacking in their generalization capabilities. These methods usually require sampling an exhaustive amount of data from different environment configurations. We propose a hybrid policy architecture with a deep network and a shortest path planner working in unison. The model can be trained end-to-end via blackbox-differentiation. The deep network learns to predict time-dependent way-costs such that internal plans match expert trajectories. These neuro-algorithmic policies generalize well to unseen environment configurations. Teaser video | [ protected link dropped ] Marin Vlastelica 🔗 - (#92 / Sess. 1) From Graph Low-Rank Global Attention to 2-FWL Approximation (Poster Teaser)    Graph Neural Networks are known to have an expressive power bounded by that of the vertex coloring algorithm \cite{xu2018how, morris2018weisfeiler}. However, for rich node features, such a bound does not exist and GNNs can be shown to be universal. Unfortunately, expressive power alone does not imply good generalization. We suggest the Low-Rank Global Attention (LRGA) module, taking advantage of the efficiency of low rank matrix-vector multiplication, that improves the algorithmic alignment \cite{Xu2019algoalign} of GNNs with the more powerful 2-folklore Weisfeiler-Lehman (FWL) algorithm. Furthermore, we provide a sample complexity bound for the module using kernel feature map interpretation of 2-FWL. Empirically, augmenting existing GNN layers with LRGA produces state of the art results on most datasets in a GNN standard benchmark. Teaser video | [ protected link dropped ] Omri Puny 🔗 - (#90 / Sess. 1) Pointer Graph Networks (Poster Teaser)    Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving. We introduce Pointer Graph Networks (PGNs) which augment sets or graphs with additional inferred edges for improved model expressivity. PGNs allow each node to dynamically point to another node, followed by message passing over these pointers. Despite its sparsity, this adaptable graph structure proves sufficiently expressive to simulate complex algorithms. The pointing mechanism is supervised to model long-term sequences of operations on classical data structures. PGNs can learn parallelisable variants of pointer-based data structures, and generalise out-of-distribution to 5x larger test inputs on dynamic graph connectivity tasks, outperforming unrestricted GNNs and Deep Sets. Teaser video | [ protected link dropped ] Petar Veličković 🔗 - (#50 / Sess. 1) Graph Convolutional Gaussian Processes for Link Prediction (Poster Teaser)    Link prediction aims to reveal missing edges in a graph. We address this task with a deep graph convolutional Gaussian process model. The Gaussian process is transformed using simplified graph convolutions to better leverage the topological information of the graph domain. To scale the Gaussian process model to larger graphs, we introduce a variational inducing point method that places pseudo-inputs on a graph-structured domain. The proposed model represents the first Gaussian process for link prediction that can make use of both node features and topological information. We evaluate our model on three graph data sets with up to thousands of nodes and report consistent improvements over existing Gaussian process models and state-of-the-art graph neural network approaches. Teaser video | [ protected link dropped ] Felix Opolka 🔗 - (#89 / Sess. 1) Graphs, Entities, and Step Mixture (Poster Teaser)    Existing approaches for graph neural networks commonly suffer from the oversmoothing issue, regardless of how neighborhoods are aggregated. Most methods also focus on transductive scenarios for fixed graphs, leading to poor generalization for unseen graphs. To address these issues, we propose a new graph neural network that considers both edge-based neighborhood relationships and node-based entity features, i.e.Graph Entities with Step Mixture via random walk (GESM). GESM employs a mixture of various steps through random walk to alleviate the oversmoothing problem, attention to dynamically reflect interrelations depending on node information, and structure-based regularization to enhance embedding representation. With intensive experiments, we show that the proposed GESM achieves state-of-the-art or comparable performances on eight benchmark graph datasets. Teaser video | [ protected link dropped ] Kyuyong Shin 🔗 - (#51 / Sess. 1) Deep Lagrangian Propagation in Graph Neural Networks (Poster Teaser)    Graph Neural Networks (Scarselli et al., 2009) exploit an iterative diffusion procedure to compute the node states as the fixed point of the trainable state transition function. In this paper, we show how to cast this scheme as a constrained optimization problem, thus avoiding the unfolding procedure required for the computation of the fixed point. This is done by searching for saddle points of the Lagrangian function in the space of the weights, state variables and Lagrange multipliers. The proposed approach shows state-of-the-art performance in multiple standard benchmarks in graph domains. Teaser video | [ protected link dropped ] Matteo Tiezzi 🔗 - (#87 / Sess. 2) Bi-Level Attention Neural Architectures for Relational Data (Poster Teaser)    We present Bi-Level Attention-Based Relational Graph Convolutional Networks (BR-GCN), unique neural network architectures that utilize masked self-attentional layers with relational graph convolutions, to effectively operate on highly multi-relational data. BR-GCN models use bi-level attention to learn node embeddings through (1) node-level attention, and (2) relation-level attention. BR-GCN’s node-level self-attentional layers use intra-relational graph interactions to learn relation-specific node embeddings using a weighted aggregation of neighborhood features in a sparse subgraph region. BR-GCN's relation-level self-attentional layers use inter-relational graph interactions to learn the final node embeddings using a weighted aggregation of relation-specific node embeddings. BR-GCN's bi-level attention mechanism extends Transformer-based multiplicative attention from the natural language processing (NLP) domain, and Graph Attention Networks (GAT)-based attention, to large-scale heterogeneous graphs (HGs). On node classification, BR-GCN outperforms baselines from 0.29% to 14.95%, and on the link prediction task, outperforms baselines from 0.02% to 7.40% and suggests to enrich HG embedding models. We also conduct ablation studies to evaluate the quality of BR-GCN's relation-level attention and discuss how its learning of graph structure may be transferred to enrich other Graph Neural Networks (GNNs). Through various experiments, we show that BR-GCN's attention mechanism is both scalable and more effective in learning compared to state-of-the-art GNNs. Teaser video | [ protected link dropped ] Roshni Iyer 🔗 - (#53 / Sess. 1) Scene Graph Reasoning for Visual Question Answering (Poster Teaser)    Visual question answering is concerned with answering free-form questions about an image. Since it requires a deep linguistic understanding of the question and the ability to associate it with various objects that are present in the image, it is an ambitious task and requires techniques from both computer vision and natural language processing. We propose a novel method that approaches the task by performing context-driven, sequential reasoning based on the objects and their semantic and spatial relationships present in the scene. As a first step, we derive a scene graph which describes the objects in the image, as well as their attributes and their mutual relationships. A reinforcement agent then learns to autonomously navigate over the extracted scene graph to generate paths, which are then the basis for deriving answers. We conduct a first experimental study on the challenging GQA dataset with manually curated scene graphs, where our method reaches the level of human performance. Teaser video | [ protected link dropped ] Rajat Koner 🔗 - (#86 / Sess. 2) Graph Generation with Energy-Based Models (Poster Teaser)    We present a set of novel, energy-based models built on top of graph neural networks (GNN-EBMs) to estimate the unnormalized density of a distribution of graphs. GNN-EBMs can generate graphs implicitly via MCMC sampling. We compare the performance of GNN-EBMs trained using 3 different estimators: pseudolikelihood, conditional noise contrastive estimation, and persistent contrastive divergence (PCD). We find that all 3 estimators result in models that generalize well, while models trained with PCD generate samples that are competitive with state-of-the-art baselines. Finally, we discuss the potential of GNN-EBMs beyond generation for diverse tasks such as semi-supervised learning and outlier detection. Teaser video | [ protected link dropped ] Jenny Liu 🔗 - (#54 / Sess. 2) SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks (Poster Teaser)    We introduce the SE(3)-Transformer, a variant of the self-attention module for 3D point clouds, which is equivariant under continuous 3D roto-translations. Equivariance is important to ensure stable and predictable performance in the presence of nuisance transformations of the data input. A positive corollary of equivariance is increased weight-tying within the model, leading to fewer trainable parameters and thus decreased sample complexity (i.e. we need less training data). The SE(3)-Transformer leverages the benefits of self-attention to operate on large point clouds with varying number of points while guaranteeing SE(3)-equivariance for robustness. We achieve competitive performance on two real-world datasets, ScanObjectNN and QM9. Teaser video | [ protected link dropped ] Fabian Fuchs 🔗 - (#84 / Sess. 2) UniKER: A Unified Framework for Combining Embedding and Horn Rules for Knowledge Graph Inference (Poster Teaser)    Combine KGE and logical rules for better KG inference have gained increasing attention in recent years. Unfortunately, a majority of existing employ sampling strategies to randomly select only a small portion of ground rules or hidden triples, thus only partially leverage the power of logical rules in reasoning. In this paper, we propose a novel framework UniKER to address this challenge by restricting logical rules to be Horn rules, which can fully exploit the knowledge in logical rules and enable the mutual enhancement of logical rule-based reasoning and KGE in an extremely efficient way. Extensive experiments have demonstrated that our approach is superior to existing state-of-the-art algorithms in terms of both efficiency and effectiveness. Teaser video | [ protected link dropped ] Kewei Cheng 🔗 - (#83 / Sess. 2) Connecting Graph Convolutional Networks and Graph-Regularized PCA (Poster Teaser) Graph convolution operator of the GCN model is originally motivated from a localized first-order approximation of spectral graph convolutions. This work stands on a different view; establishing a connection between graph convolution and graph-regularized PCA. Based on this connection, GCN architecture, shaped by stacking graph convolution layers, shares a close relationship with stacking graph-regularized PCA (GPCA). We empirically demonstrate that the unsupervised embeddings by GPCA paired with a logistic regression classifier achieves similar performance to GCN on semi-supervised node classification tasks. Further, we capitalize on the discovered relationship to design an effective initialization strategy for GCN based on stacking GPCA. Teaser video | [ protected link dropped ] Lingxiao Zhao 🔗 - (#55 / Sess. 1) Uncertainty in Neural Relational Inference Trajectory Reconstruction (Poster Teaser)    Neural networks used for multi-interaction trajectory reconstruction lack the ability to estimate the uncertainty in their outputs, which would be useful to better analyse and understand the systems they model. In this paper we extend the Factorised Neural Relational Inference model to output both a mean and a standard deviation for each component of the phase space vector, which together with an appropriate loss function, can account for uncertainty. A variety of loss functions are investigated including ideas from convexification and a Bayesian treatment of the problem. We show that the physical meaning of the variables is important when considering the uncertainty and demonstrate the existence of pathological local minima that are difficult to avoid during training. Teaser video | [ protected link dropped ] Vasileios Karavias 🔗 - (#82 / Sess. 2) Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks (Poster Teaser)    Graph convolutional networks (GCNs) are widely used for semi-supervised node classification on graphs today. The graph structure is however only accounted for by considering the similarity of activations between adjacent nodes, in turn degrading the results. In this work, we augment GCN models by incorporating richer notions of regularity by leveraging cascades of band-pass filters, known as geometric scatterings. We introduce a new hybrid architecture for the task and demonstrate its potential on multiple graph datasets, where it outperforms leading GCN models. Teaser video | [ protected link dropped ] Frederik Wenkel 🔗 - (#58 / Sess. 1) Temporal Graph Networks for Deep Learning on Dynamic Graphs (Poster Teaser)    Graph Neural Networks (GNNs) have become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches, being at the same time more computationally efficient. Teaser video | [ protected link dropped ] Emanuele Rossi 🔗 - (#80 / Sess. 2) Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings (Poster Teaser)    The 1-dimensional Weisfeiler-Leman (WL) algorithm has recently emerged as a powerful tool for analysis and design of kernels and neural architectures for (supervised) learning with graphs. However, due to the purely local nature of the algorithms, they might miss essential patterns. Here, the k-dimensional WL accounts for the higher-order interactions between vertices by defining a suitable notion of adjacency between k-tuples of vertices. However, it does not scale and may suffer from overfitting when used in a machine learning setting. We circumvent these issues by proposing new local variants of the k-WL and corresponding neural architectures, which consider a subset of the original neighborhood. The expressive power of (one of) our algorithms is strictly higher than the original k-WL with no losses in scalability. In fact, for sparse graphs, the scalability improves by several orders of magnitude. The kernel version establishes a new state-of-the-art for graph classification on a wide range of benchmark datasets, while the neural version shows promising performance on large-scale molecular regression tasks. Teaser video | [ protected link dropped ] Christopher Morris 🔗 - (#62 / Sess. 2) Relate and Predict: Structure-Aware Prediction with Jointly Optimized Neural Dependency Graph (Poster Teaser)    Understanding relationships between feature variables is one important way humans use to make decisions. However, state-of-the-art deep learning studies either focus on task-agnostic statistical dependency learning or do not model explicit feature dependencies during prediction. We propose a deep neural network framework, dGAP, to learn neural dependency Graph and optimize structure-Aware target Prediction simultaneously. dGAP trains towards a structure self-supervision loss and a target prediction loss jointly. Our method leads to an interpretable model that can disentangle sparse feature relationships, informing the user how relevant dependencies impact the target task. We empirically evaluate dGAP on multiple simulated and real datasets. dGAP is not only accurate, but can also recover correct dependency structure. Teaser video | [ protected link dropped ] Arshdeep Sekhon 🔗 - (#79 / Sess. 1) TUDataset: A collection of benchmark datasets for learning with graphs (Poster Teaser)    Recently, there has been an increasing interest in (supervised) learning with graph data, especially using graph neural networks. However, the development of meaningful benchmark datasets and standardized evaluation procedures is lagging, consequently hindering advancements in this area. To address this, we introduce the TUDataset for graph classification and regression. The collection consists of over 120 datasets of varying sizes from a wide range of applications. We provide Python-based data loaders, kernel and graph neural network baseline implementations, and evaluation tools. Here, we give an overview of the datasets, standardized evaluation procedures, and provide baseline experiments. All datasets are available at www.graphlearning.io. The experiments are fully reproducible from the code available at www.github.com/chrsmrrs/tudataset. Teaser video | [ protected link dropped ] Nils Kriege 🔗 - (#63 / Sess. 2) Stay Positive: Knowledge Graph Embedding Without Negative Sampling (Poster Teaser)    Knowledge graphs (KGs) are typically incomplete and we often wish to infer new facts given the existing ones. This can be thought of as a binary classification problem; we aim to predict if new facts are true or false. Unfortunately, we generally only have positive examples (the known facts) but we also need negative ones to train a classifier. To resolve this, it is usual to generate negative examples using a negative sampling strategy. However, this can produce false negatives which may reduce performance, is computationally expensive, and does not produce calibrated classification probabilities. In this paper, we propose a training procedure that obviates the need for negative sampling by adding a novel regularization term to the loss function. Our results for two relational embedding models (DistMult and SimplE) show the merit of our proposal both in terms of performance and speed. Teaser video | [ protected link dropped ] Ainaz Hajimoradlou 🔗 - (#77 / Sess. 1) SIGN: Scalable Inception Graph Neural Networks (Poster Teaser)    The popularity of graph neural networks has sparked interest, both in academia and in industry, in developing methods that scale to very large graphs such as Facebook or Twitter social networks. In most of these approaches, the computational cost is alleviated by a sampling strategy retaining a subset of node neighbors or subgraphs at training time. In this paper we propose a new, efficient and scalable graph deep learning architecture, which sidesteps the need for graph sampling by using graph convolutional filters of different size that are amenable to efficient precomputation, allowing extremely fast training and inference. Our architecture allows using different local graph operators (e.g. motif-induced adjacency matrices or Personalized Page Rank diffusion matrix) to best suit the task at hand. We conduct extensive experimental evaluation on various open benchmarks and show that our approach is competitive with other state-of-the-art architectures, while requiring a fraction of training and inference time. Teaser video | [ protected link dropped ] Fabrizio Frasca 🔗 - (#76 / Sess. 1) Graph Clustering with Graph Neural Networks (Poster Teaser)    Graph Neural Networks (GNNs) have achieved state-of-the-art results on many graph analysis tasks such as node classification and link prediction. However, important unsupervised problems on graphs, such as graph clustering, have proved more resistant to advances in GNNs. In this paper, we study unsupervised training of GNN pooling in terms of their clustering capabilities. We draw a connection between graph clustering and graph pooling: intuitively, a good graph clustering is what one would expect from a GNN pooling layer. Counterintuitively, we show that this is not true for state-of-the-art pooling methods, such as MinCut pooling. To address these deficiencies, we introduce Deep Modularity Networks (DMoN), an unsupervised pooling method inspired by the modularity measure of clustering quality, and show how it tackles recovery of the challenging clustering structure of graphs. In order to clarify the regimes where existing methods fail, we carefully design a set of experiments on synthetic data which show that DMoN is able to jointly leverage the signal from the graph structure and node features. Similarly, on real-world data, we show that DMoN produces high quality clusters which correlate strongly with ground truth labels, achieving state-of-the-art results. Teaser video | [ protected link dropped ] Anton Tsitsulin 🔗 - (#64 / Sess. 1) Differentiable Graph Module (DGM) for Graph Convolutional Networks (Poster Teaser)    Graph deep learning has recently emerged as a powerful ML concept allowing to generalize successful deep neural architectures to non-Euclidean structured data. Such methods have shown promising results on a broad spectrum of applications ranging from social science, biomedicine, and particle physics to computer vision, graphics, and chemistry. One of the limitations of the majority of the current graph neural network architectures is that they are often restricted to the transductive setting and rely on the assumption that the underlying graph is known and fixed. In many settings, such as those arising in medical and healthcare applications, this assumption is not necessarily true since the graph may be noisy, partially- or even completely unknown, and one is thus interested in inferring it from the data. This is especially important in inductive settings when dealing with nodes not present in the graph at training time. In this paper, we introduce Differentiable Graph Module (DGM), a learnable function predicting the edge probability in the graph relevant for the task, that can be combined with convolutional graph neural network layers and trained in an end-to-end fashion. We provide an extensive evaluation of applications from the domains of healthcare (disease prediction) and brain imaging (age prediction). We show that our model provides a significant improvement over baselines both in transductive and inductive settings and achieves state-of-the-art results. Teaser video | [ protected link dropped ] Anees Kazi 🔗 - (#75 / Sess. 1) Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting (Poster Teaser)    Graph Neural Networks (GNNs) have been recently found to suffer from important limitations regarding their ability to capture the structure of the underlying graph. It has been shown that the expressive power of standard GNNs is bounded by the Weisfeiler-Lehman (WL) graph isomorphism test, from which they inherit proven limitations such as the inability to detect and count graph substructures. On the other hand, there is significant empirical evidence that substructures are often informative for downstream tasks, suggesting that it is desirable to design GNNs capable of leveraging this important source of information. To this end, we propose a novel topologically-aware message passing scheme based on subgraph isomorphism counting. We show that our architecture allows incorporating domain-specific inductive biases and that it is strictly more expressive than the WL test, while being able to disambiguate even hard instances of graph isomorphism. In contrast to recent works, our approach does not attempt to adhere to the WL hierarchy and therefore retains the locality and linear complexity of standard GNNs. Teaser video | [ protected link dropped ] Georgios Bouritsas 🔗 - (#65 / Sess. 2) Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs (Poster Teaser)    We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinguished properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. Meanwhile, it also reveals the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regards to past events. Teaser video | [ protected link dropped ] Kian Ahrabian 🔗 - (#66 / Sess. 2) Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction (Poster Teaser)    We introduce Bi-GNN for modeling biological link prediction tasks such as drug-drug interaction (DDI) and protein-protein interaction (PPI). Taking drug-drug interaction as an example, existing methods using machine learning either only utilize the link structure between drugs without using the graph representation of each drug molecule, or only leverage the individual drug compound structures without using graph structure for the higher-level DDI graph. The key idea of our method is to fundamentally view the data as a bi-level graph, where the highest level is a representing the interaction between biological entities (interaction graph), and each biological entity itself is further expanded to its intrinsic graph representation (representation graphs), where the graph is either flat like a drug compound or hierarchical like a protein with amino acid level graph, secondary structure, tertiary structure, etc. Our model not only allows the usage of information from both the high-level interaction graph and the low-level representation graphs, but also offers a baseline for future research opportunities to address the bi-level nature of the data. Teaser video | [ protected link dropped ] Ken Gu 🔗 - (#74 / Sess. 2) Relation-Dependent Sampling for Multi-Relational Link Prediction (Poster Teaser)    Multi-relational graphs contain various types of relations that usually come with variable frequency and have different importance for the problem at hand. Existing graph sampling approaches ignore the multi-relational nature of such graphs. We propose an approach to modeling the importance of relation types for sampling and show that we can learn the right balance: relation-type probabilities that reflect both frequency and importance. We use relation-dependent sampling to develop a scalable graph neural network and apply it for multi-relational link prediction. Our experiments specifically consider drug-drug interaction (DDI) prediction, an important task during drug development. In that context, we further show the benefit of considering a special relation type, negative edges. Synergistic drug combinations (e.g., drugs that synergize by improved efficacy and reduced side effects) can be regarded as negative evidence for DDIs (which often indicate adverse reactions). We add drug synergy data to provide extra expert knowledge which can be easily integrated into our model and yields improved performance. Teaser video | [ protected link dropped ] Veronika Thost · Arthur Feeney · Rishabh Gupta 🔗 - (#67 / Sess. 2) Continuous Graph Flow (Poster Teaser)    In this paper, we propose Continuous Graph Flow, a generative continuous flow based method that aims to model complex distributions of graph-structured data. Our proposed model learns a joint probability density over a set of related random variables by formulating it as first order ordinary differential equation system with shared and reusable functions that operate over the graph structure. This leads to a reversible continuous message passing over time resulting in continuous transformations of probability distributions of the variables. We evaluate our model on a diverse set of generation tasks: graph generation, image puzzle generation, and layout generation from scene graphs. Experimental results show that CGF-based models outperform state-of-the-art graph generative models. Teaser video | [ protected link dropped ] Zhiwei Deng 🔗 - (#70 / Sess. 2) Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits (Poster Teaser)    One of the key challenges in automated chemical synthesis planning is to propose diverse and reliable reactions. A common approach is to generate reactions using reaction templates, which represent a reaction as a fixed graph transformation. This enables accurate and interpretable predictions but can suffer from limited diversity. On the other hand, template-free methods increase diversity but can be prone to making trivial mistakes. Inspired by the efficacy of reaction templates, we propose Molecule Edit Graph Attention Network (MEGAN), a template-free model that encodes reaction as a sequence of graph edits. Our model achieves state-of-the-art results on a standard retrosynthesis benchmark without any manual rule encoding. Teaser video | [ protected link dropped ] Mikołaj Sacha · Mikołaj Błaż 🔗 - (#73 / Sess. 2) Evaluating Logical Generalization in Graph Neural Networks (Poster Teaser)    Recent research has highlighted the role of relational inductive biases in building learning agents that can generalize and reason in a compositional manner. However, while relational learning algorithms such as graph neural networks (GNNs) show promise, we do not understand how effectively these approaches can adapt to new tasks. In this work, we study the task of logical generalization using GNNs by designing a benchmark suite grounded in first-order logic. Our benchmark suite, GraphLog, requires that learning algorithms perform rule induction in different synthetic logics, represented as knowledge graphs. GraphLog consists of relation prediction tasks on 57 distinct logical domains. We use GraphLog to evaluate GNNs in three different setups: single-task supervised learning, multi-task pretraining, and continual learning. Unlike previous benchmarks, our approach allows us to precisely control the logical relationship between the different tasks. We find that the ability for models to generalize and adapt is strongly determined by the diversity of the logical rules they encounter during training, and our results highlight new challenges for the design of GNN models. Teaser video | [ protected link dropped ] Koustuv Sinha 🔗 - (#71 / Sess. 1) Population Graph GNNs for Brain Age Prediction (Poster Teaser)    Many common neurological and neurodegenerative disorders, such as Alzheimer’s disease, dementia and multiple sclerosis, have been associated with abnormal patterns of apparent ageing of the brain. Discrepancies between the estimated brain age and the actual chronological age (brain age gaps) can be used to understand the biological pathways behind the ageing process, assess an individual’s risk for various brain disorders and identify new personalised treatment strategies. By flexibly integrating minimally preprocessed neuroimaging and non-imaging modalities into a population graph data structure, we train two types of graph neural network (GNN) architectures to predict brain age in a clinically relevant fashion as well as investigate their robustness to noisy inputs and graph sparsity. The population graph approach has the potential to learn from the entire cohort of healthy and affected subjects of both sexes at once, capturing a wide range of confounding effects and detecting variations in brain age trends between different sub-populations of subjects. Teaser video | [ protected link dropped ] Kamile Stankeviciute 🔗

#### Author Information

##### Petar Veličković (DeepMind)

Petar Veličković is a Staff Research Scientist at DeepMind, Affiliated Lecturer at the University of Cambridge, and an Associate of Clare Hall, Cambridge. He holds a PhD in Computer Science from the University of Cambridge (Trinity College), obtained under the supervision of Pietro Liò. Petar's research concerns geometric deep learning—devising neural network architectures that respect the invariances and symmetries in data (a topic he's co-written a proto-book about). For his contributions to the area, Petar is recognised as an ELLIS Scholar in the Geometric Deep Learning Program. Within this area, Petar focusses on graph representation learning and its applications in algorithmic reasoning and computational biology. In particular, he is the first author of Graph Attention Networks—a popular convolutional layer for graphs—and Deep Graph Infomax—a popular self-supervised learning pipeline for graphs (featured in ZDNet). Petar's research has been used in substantially improving the travel-time predictions in Google Maps (featured in the CNBC, Endgadget, VentureBeat, CNET, the Verge and ZDNet), and guiding the intuition of mathematicians towards new top-tier theorems and conjectures (featured in Nature, New Scientist, The Independent, Sky News, The Sunday Times and The Conversation).

##### Jessica Hamrick (DeepMind)

Jessica Hamrick is a Senior Research Scientist at DeepMind, where she studies how to build machines that can flexibly build and deploy models of the world as well as humans. Her work combines insights from cognitive science with structured relational architectures, model-based deep reinforcement learning, and planning. Jessica received her Ph.D. in Psychology from UC Berkeley, and her M.Eng. in Computer Science and Engineering from MIT.