VarScene: A Deep Generative Model for Realistic Scene Graph Synthesis

Tathagat Verma · Abir De · Yateesh Agrawal · Vishwa Vinay · Soumen Chakrabarti

Hall E #300

Keywords: [ DL: Other Representation Learning ] [ DL: Graph Neural Networks ] [ DL: Generative Models and Autoencoders ]


Scene graphs are powerful abstractions that capture relationships between objects in images by modeling objects as nodes and relationships as edges. Generating realistic synthetic scene graphs has applications such as scene synthesis and data augmentation for supervised learning. Existing graph generative models predominantly target molecular graphs, leveraging the limited vocabulary of atoms and bonds and the well-defined semantics of chemical compounds. In contrast, scene graphs have much larger object and relation vocabularies, and their semantics are latent. To address this challenge, we propose a variational autoencoder for scene graphs that is trained to minimize the maximum mean discrepancy (MMD) between the ground-truth scene graph distribution and the distribution of generated scene graphs. Our method views a scene graph as a collection of star graphs and encodes it into a latent representation of the underlying stars. The decoder generates scene graphs by learning to sample the component stars and the edges between them. Our experiments show that our method mimics the underlying scene graph generative process more accurately than several state-of-the-art baselines.
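To make the MMD criterion concrete, here is a minimal sketch of a sample-based estimate of squared MMD between two sets of graphs. The abstract does not say which kernel or which graph statistics the method uses, so the Gaussian kernel, the bandwidth `sigma`, and the toy three-dimensional feature vectors (standing in for some per-graph summary, such as a normalized degree histogram) are all assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with each expectation replaced by an average over all sample pairs.
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, sigma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical per-graph feature vectors; not taken from the paper.
real = [(0.5, 0.3, 0.2), (0.6, 0.3, 0.1), (0.4, 0.4, 0.2)]
close = [(0.5, 0.35, 0.15), (0.55, 0.3, 0.15), (0.45, 0.35, 0.2)]
far = [(0.0, 0.1, 0.9), (0.1, 0.1, 0.8), (0.0, 0.2, 0.8)]

print("MMD^2(real, close):", mmd_squared(real, close))
print("MMD^2(real, far):  ", mmd_squared(real, far))
```

A generator whose samples resemble the real graphs should yield a smaller value than one whose samples do not, which is why the quantity can serve as a training objective or an evaluation metric.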