Information-Theoretic Generalization Bounds for VAEs: The Role of the Encoder and Latent Variables
Abstract
Despite their remarkable success, a rigorous theoretical understanding of how latent variables (LVs) govern the generalization performance of Variational Autoencoders (VAEs) remains largely elusive. Existing theoretical analyses are confined to supervised learning or to models with discrete latent spaces, leaving the role of LVs in standard VAEs with continuous latent spaces poorly understood. This paper establishes the first information-theoretic generalization analysis for VAEs by adapting the leave-one-out conditional mutual information (loo-CMI) framework, originally developed for supervised learning, to the unsupervised, continuous latent space of these models. Our analysis reveals that the generalization error of a VAE is bounded solely by the information complexity of the encoder and the LVs, independent of the decoder. We demonstrate the versatility of our framework by extending it in two directions: to hierarchical VAEs, for which we provide layer-wise bounds, and to data generation, where we link our information-theoretic principles to a novel bound on the 2-Wasserstein distance between the true and generated distributions.
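For intuition only, here is a minimal sketch of the generic shape that CMI-style generalization bounds take; the abstract does not state the paper's bound, and the symbols below ($C_n$, $\mathrm{gen}(n)$) are our illustrative notation, not the paper's.

% Illustrative sketch (our notation, not the paper's theorem): for a
% loss bounded in [0,1] and n training samples, bounds in the CMI
% family typically control the expected generalization gap by an
% information-complexity term C_n; in this paper's setting, C_n would
% involve only the encoder and the latent variables, not the decoder.
\[
  \bigl|\mathbb{E}[\mathrm{gen}(n)]\bigr|
  \;\le\;
  \sqrt{\frac{2\,C_n}{n}}.
\]
% In the leave-one-out CMI framework, C_n is a conditional mutual
% information between the learned object and a leave-one-out index,
% given the data; it shrinks as the learner depends less on any
% single training sample.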