

Equivariant Transformer Networks

Kai Sheng Tai · Peter Bailis · Gregory Valiant

Pacific Ballroom #18

Keywords: [ Representation Learning ] [ Computer Vision ] [ Architectures ] [ Algorithms ]


How can prior knowledge of the transformation invariances of a domain be incorporated into the architecture of a neural network? We propose Equivariant Transformers (ETs), a family of differentiable image-to-image mappings that improve the robustness of models to pre-defined continuous transformation groups. Through the use of specially derived canonical coordinate systems, ETs incorporate functions that are equivariant by construction with respect to these transformations. We show empirically that ETs can be flexibly composed to improve model robustness to more complicated transformation groups with several parameters. On a real-world image classification task, ETs improve the sample efficiency of ResNet classifiers, achieving relative improvements in error rate of up to 15% in the limited data regime while increasing model parameter count by less than 1%.
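The idea of a canonical coordinate system can be illustrated with a toy example for the rotation group. This is a minimal sketch, not the authors' implementation: it assumes a plain polar grid, rotations by whole multiples of the grid's angular step, and hypothetical helper names (`polar_sample`, `rotate`, `shift_columns`, `circular_conv`). In polar coordinates a rotation of the input becomes a circular shift along the angle axis, so any circular convolution along that axis is equivariant to rotation by construction.

```python
import math

def polar_sample(f, n_r=4, n_theta=8, r_max=1.0):
    """Sample f(x, y) on a polar grid: rows index radius, columns index angle."""
    grid = []
    for i in range(1, n_r + 1):
        r = r_max * i / n_r
        row = []
        for j in range(n_theta):
            t = 2 * math.pi * j / n_theta
            row.append(f(r * math.cos(t), r * math.sin(t)))
        grid.append(row)
    return grid

def rotate(f, angle):
    """Return f rotated counter-clockwise by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    def g(x, y):
        # Evaluate f at the point (x, y) rotated back by -angle.
        return f(c * x + s * y, -s * x + c * y)
    return g

def shift_columns(grid, k):
    """Circularly shift every row right by k positions along the angle axis."""
    n = len(grid[0])
    k %= n
    return [row[n - k:] + row[:n - k] for row in grid]

def circular_conv(grid, kernel):
    """Circular 1-D convolution of each row along the angle axis."""
    n = len(grid[0])
    return [
        [sum(kernel[m] * row[(j - m) % n] for m in range(len(kernel)))
         for j in range(n)]
        for row in grid
    ]

def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two grids."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical test function standing in for an image.
f = lambda x, y: x + 2 * y * y + x * y
n_theta, k = 8, 3
angle = 2 * math.pi * k / n_theta

base = polar_sample(f, n_theta=n_theta)
rotated = polar_sample(rotate(f, angle), n_theta=n_theta)

# Rotation in image space equals a shift in polar coordinates (up to rounding).
shift_error = max_abs_diff(rotated, shift_columns(base, k))

# Convolving then shifting equals shifting then convolving: equivariance.
kernel = [0.5, 0.25, 0.25]
equiv_error = max_abs_diff(
    circular_conv(shift_columns(base, k), kernel),
    shift_columns(circular_conv(base, kernel), k),
)
```

Both `shift_error` and `equiv_error` should be zero up to floating-point rounding. Other one-parameter groups would need a different change of coordinates (for example, a logarithmic radial axis for scaling); the sketch covers only the rotation case.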
