Providing generalization guarantees for modern neural networks has been a crucial task in statistical learning. Recently, several studies have attempted to analyze the generalization error in such settings by using tools from fractal geometry. While these works have successfully introduced new mathematical tools to apprehend generalization, they heavily rely on a Lipschitz continuity assumption, which in general does not hold for neural networks and might make the bounds vacuous. In this work, we address this issue and prove fractal geometry-based generalization bounds without requiring any Lipschitz assumption. To achieve this goal, we build upon a classical covering argument in learning theory and introduce a data-dependent fractal dimension. Despite introducing significant technical complications, this new notion lets us control the generalization error (over either fixed or random hypothesis spaces) along with certain mutual information (MI) terms. To provide a clearer interpretation of the newly introduced MI terms, we then introduce a notion of 'geometric stability' and link our bounds to prior art. Finally, we make a rigorous connection between the proposed data-dependent dimension and topological data analysis tools, which enables us to compute the dimension in a numerically efficient way. We support our theory with experiments conducted in various settings.
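The last step of the abstract, computing the data-dependent dimension with topological data analysis tools, can be illustrated with a small numerical sketch. The snippet below is not the authors' implementation: it estimates a persistent-homology-based (PH^0) fractal dimension of a point cloud (for instance, optimizer iterates) from the scaling of minimum-spanning-tree lengths over random subsamples. All function names, sample sizes, and the `ph0_dimension` helper are hypothetical choices made purely for illustration.

```python
# Hypothetical sketch: estimating a PH^0-style fractal dimension of a point
# cloud via minimum-spanning-tree (MST) lengths. Not the authors' code.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def mst_total_length(points: np.ndarray) -> float:
    """Total edge length of the Euclidean MST of the point cloud.

    Up to convention, this equals the sum of finite 0-dimensional
    persistence lifetimes of the Vietoris-Rips filtration.
    """
    dists = squareform(pdist(points))
    mst = minimum_spanning_tree(dists)
    return mst.sum()


def ph0_dimension(points: np.ndarray, sample_sizes=None, seed=0) -> float:
    """Estimate a PH^0 dimension from the scaling E(n) ~ n^(1 - 1/d).

    A log-log fit of the MST length E(n) against the subsample size n
    gives slope = 1 - 1/d, hence d = 1 / (1 - slope).
    """
    rng = np.random.default_rng(seed)
    n_total = len(points)
    if sample_sizes is None:
        sample_sizes = np.linspace(n_total // 10, n_total, num=10, dtype=int)
    lengths = []
    for n in sample_sizes:
        idx = rng.choice(n_total, size=n, replace=False)
        lengths.append(mst_total_length(points[idx]))
    slope, _ = np.polyfit(np.log(sample_sizes), np.log(lengths), deg=1)
    return 1.0 / (1.0 - slope)


if __name__ == "__main__":
    # Sanity check: points drawn from a 2-dimensional set should give d ≈ 2.
    cloud = np.random.default_rng(1).uniform(size=(2000, 2))
    print(f"Estimated PH^0 dimension: {ph0_dimension(cloud):.2f}")
```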
Author Information
Benjamin Dupuis (Inria, ENS)
George Deligiannidis (Oxford)
Umut Simsekli (Inria, ENS)
More from the Same Authors
- 2023 Poster: Algorithmic Stability of Heavy-Tailed SGD with General Loss Functions
  Anant Raj · Lingjiong Zhu · Mert Gurbuzbalaban · Umut Simsekli
- 2022 Poster: Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
  Liam Hodgkinson · Umut Simsekli · Rajiv Khanna · Michael Mahoney
- 2022 Spotlight: Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers
  Liam Hodgkinson · Umut Simsekli · Rajiv Khanna · Michael Mahoney
- 2021 Poster: Differentiable Particle Filtering via Entropy-Regularized Optimal Transport
  Adrien Corenflos · James Thornton · George Deligiannidis · Arnaud Doucet
- 2021 Oral: Differentiable Particle Filtering via Entropy-Regularized Optimal Transport
  Adrien Corenflos · James Thornton · George Deligiannidis · Arnaud Doucet
- 2021 Poster: Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections
  Alexander D Camuto · Xiaoyu Wang · Lingjiong Zhu · Christopher Holmes · Mert Gurbuzbalaban · Umut Simsekli
- 2021 Spotlight: Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections
  Alexander D Camuto · Xiaoyu Wang · Lingjiong Zhu · Christopher Holmes · Mert Gurbuzbalaban · Umut Simsekli
- 2021 Poster: The Heavy-Tail Phenomenon in SGD
  Mert Gurbuzbalaban · Umut Simsekli · Lingjiong Zhu
- 2021 Spotlight: The Heavy-Tail Phenomenon in SGD
  Mert Gurbuzbalaban · Umut Simsekli · Lingjiong Zhu
- 2021 Poster: Relative Positional Encoding for Transformers with Linear Complexity
  Antoine Liutkus · Ondřej Cífka · Shih-Lun Wu · Umut Simsekli · Yi-Hsuan Yang · Gaël RICHARD
- 2021 Oral: Relative Positional Encoding for Transformers with Linear Complexity
  Antoine Liutkus · Ondřej Cífka · Shih-Lun Wu · Umut Simsekli · Yi-Hsuan Yang · Gaël RICHARD
- 2020 Poster: Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows
  Rob Cornish · Anthony Caterini · George Deligiannidis · Arnaud Doucet
- 2019 Poster: Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
  Rob Cornish · Paul Vanetti · Alexandre Bouchard-Côté · George Deligiannidis · Arnaud Doucet
- 2019 Oral: Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets
  Rob Cornish · Paul Vanetti · Alexandre Bouchard-Côté · George Deligiannidis · Arnaud Doucet