Oral
Making Convolutional Networks Shift-Invariant Again
Richard Zhang

Wed Jun 12th 03:05 -- 03:10 PM @ Seaside Ballroom

Modern convolutional networks are not shift-invariant, despite their convolutional nature: small shifts in the input can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, ignore the classical sampling theorem. The well-known fix is to apply a low-pass filter before downsampling. However, previous work has assumed that including such an anti-aliasing filter necessarily excludes max-pooling. We show that, when integrated correctly, these operations are in fact compatible. The technique is general: it can be incorporated into other layer types, such as average-pooling and strided convolution, and other applications, such as image classification and translation. In addition, engineering the inductive bias of shift-equivariance largely removes the need for shift-based data augmentation at training time. Our results demonstrate that this classical signal processing technique has been overlooked in modern networks.
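
The integration works by decomposing max-pooling into a dense (stride-1) max followed by a blur and a subsampling step, so the anti-aliasing filter sits between the max and the downsampling rather than replacing the max. Below is a minimal sketch of that decomposition; PyTorch, the function name max_blur_pool, and the 3-tap binomial ([1, 2, 1]) blur kernel are illustrative assumptions, not an official implementation from the talk.

    import torch
    import torch.nn.functional as F

    def max_blur_pool(x, stride=2):
        # Step 1: dense max-pooling -- take the max over 2x2 windows at
        # every position (stride 1), so nothing is discarded yet.
        x = F.max_pool2d(x, kernel_size=2, stride=1)

        # Step 2: build a low-pass (anti-aliasing) filter -- a normalized
        # [1, 2, 1] binomial kernel, applied depthwise (one per channel).
        k = torch.tensor([1.0, 2.0, 1.0])
        k = torch.outer(k, k)
        k = (k / k.sum()).view(1, 1, 3, 3)
        channels = x.shape[1]
        k = k.repeat(channels, 1, 1, 1)

        # Step 3: blur and subsample in one pass via a strided,
        # grouped (depthwise) convolution.
        return F.conv2d(x, k, stride=stride, padding=1, groups=channels)

    x = torch.randn(1, 16, 32, 32)
    y = max_blur_pool(x)  # -> shape (1, 16, 16, 16)

Because the blur averages neighboring max responses before they are subsampled, the output varies much more smoothly under input shifts than plain strided max-pooling, which is the source of the improved shift-equivariance the abstract describes.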

Author Information

Richard Zhang (Adobe)
