Workshop: New Frontiers in Adversarial Machine Learning

Adversarial robustness of $\beta$-VAE through the lens of local geometry

Asif Khan · Amos Storkey

Abstract: Variational autoencoders (VAEs) are susceptible to adversarial attacks: an adversary can find a small perturbation of an input sample that changes its latent encoding non-smoothly, thereby compromising the reconstruction. A known cause of this vulnerability is distortion of the latent space arising from a mismatch between the approximate latent posterior and the prior distribution. As a result, a slight change in the input can lead to a significant change in the latent encoding. This paper demonstrates that the sensitivity at any given input stems from the directional bias of a stochastic pullback metric tensor induced by the encoder network. The pullback metric tensor captures how an infinitesimal region changes when mapped from the input space to the latent space, so it can be viewed as a lens for analysing distortions in the latent space. We propose evaluation scores based on the eigenspectrum of the pullback metric and empirically show that these scores correlate with the robustness parameter $\beta$ of the $\beta$-VAE.
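
To make the idea of a pullback metric and its eigenspectrum concrete, the following is a minimal sketch under strong simplifying assumptions, not the authors' method. It uses a toy deterministic map `encoder_mean` in place of a trained encoder, assumes a Euclidean metric on the latent space (so the pullback metric reduces to $G(x) = J(x)^\top J(x)$ with $J$ the Jacobian of the encoder mean), ignores the stochastic part of the metric described in the abstract, and estimates $J$ by finite differences. All function names and the `anisotropy` score are hypothetical and are not the evaluation scores proposed in the paper.

```python
import numpy as np

def encoder_mean(x, W, b):
    """Toy stand-in for an encoder mean network: mu(x) = tanh(W x + b)."""
    return np.tanh(W @ x + b)

def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, shape (latent_dim, input_dim)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2.0 * eps))
    return np.stack(cols, axis=1)

def pullback_metric(f, x):
    """Pullback of the Euclidean latent metric through f: G(x) = J^T J."""
    J = jacobian_fd(f, x)
    return J.T @ J

def eigenspectrum(G):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(G)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Assumed toy setup: 5-dimensional inputs, 2-dimensional latent space.
rng = np.random.default_rng(0)
W = 0.3 * rng.normal(size=(2, 5))
b = 0.1 * rng.normal(size=2)
x = rng.normal(size=5)

f = lambda v: encoder_mean(v, W, b)
G = pullback_metric(f, x)
eigvals, eigvecs = eigenspectrum(G)

# Hypothetical score: fraction of total local sensitivity concentrated
# along the single most sensitive input direction.
anisotropy = eigvals[0] / eigvals.sum()
top_direction = eigvecs[:, 0]      # input direction that moves the encoding most
bottom_direction = eigvecs[:, -1]  # input direction that moves it least

# Compare how far the encoding moves for equal-sized input steps.
h = 1e-3
shift_top = np.linalg.norm(f(x + h * top_direction) - f(x))
shift_bottom = np.linalg.norm(f(x + h * bottom_direction) - f(x))
print("eigenvalues:", eigvals)
print("anisotropy:", anisotropy)
print("latent shift (top vs bottom direction):", shift_top, shift_bottom)
```

In this sketch, a perturbation of fixed norm along `top_direction` moves the latent encoding further than the same-sized perturbation along `bottom_direction`, which is the kind of directional bias an adversary could exploit. For a real encoder, automatic differentiation would normally replace the finite-difference Jacobian.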
