

Generative Modeling of Infinite Occluded Objects for Compositional Scene Representation

Jinyang Yuan · Bin Li · Xiangyang Xue

Pacific Ballroom #139

Keywords: [ Deep Generative Models ] [ Computer Vision ]


We present a deep generative model that explicitly models object occlusions for compositional scene representation. Latent representations of objects are disentangled into location, size, shape, and appearance, and the visual scene is generated compositionally by combining these representations with an infinite-dimensional binary vector indicating the presence of each object in the scene. By training the model to learn spatial dependencies of pixels in an unsupervised setting, the number of objects, the pixel-level segregation of objects, and the presence of objects in overlapping regions can all be estimated through inference of the latent variables. Extensive experiments on a series of specially designed datasets demonstrate that the proposed method outperforms two state-of-the-art methods when object occlusions are present.
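The compositional generation described above can be illustrated with a minimal sketch. This is not the paper's model; it only shows the hedged intuition that each object latent carries a presence indicator, a shape mask, and an appearance, and that objects ordered front-to-back are painted back-to-front so front objects occlude those behind them. All names (`compose_scene`, the dict keys) are hypothetical.

```python
# Hypothetical sketch of occlusion-aware compositional scene generation.
# Each object latent has: "pres" (binary presence indicator), "mask"
# (set of occupied pixels, a stand-in for a shape mask), and "appearance"
# (a scalar pixel value, a stand-in for an appearance map).

def compose_scene(objects, background, height, width):
    """Render objects (front-most first) onto a background canvas."""
    scene = [[background] * width for _ in range(height)]
    for obj in reversed(objects):          # paint back-to-front
        if not obj["pres"]:                # absent objects contribute nothing
            continue
        for y, x in obj["mask"]:           # front objects overwrite occluded pixels
            scene[y][x] = obj["appearance"]
    return scene

# Front square at rows/cols 0-1, back square at rows/cols 1-2, 4x4 canvas.
front = {"pres": 1,
         "mask": {(y, x) for y in range(2) for x in range(2)},
         "appearance": 1.0}
back = {"pres": 1,
        "mask": {(y, x) for y in range(1, 3) for x in range(1, 3)},
        "appearance": 0.5}
scene = compose_scene([front, back], background=0.0, height=4, width=4)
# In the overlapping region (pixel (1, 1)) the front object's appearance wins.
```

Setting an object's `pres` flag to 0 removes it from the rendered scene entirely, which mirrors how the binary presence vector gates which objects contribute to the generated image.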
