Sharpness-aware minimisation (SAM) is known to find flat minima, which benefits generalisation and improves robustness. SAM essentially replaces the loss function with the maximum loss value within a small neighborhood around the current iterate. However, it uses a Euclidean ball to define the neighborhood, which can be inaccurate: loss functions for neural networks are typically defined over probability distributions (e.g., class predictive probabilities), so the parameter space is not Euclidean. In this paper we take the information geometry of the model parameter space into account when defining the neighborhood, namely replacing SAM's Euclidean balls with ellipsoids induced by the Fisher information. Our approach, dubbed Fisher SAM, defines more accurate neighborhood structures that conform to the intrinsic metric of the underlying statistical manifold. For instance, because SAM ignores the geometry of the parameter space, it may probe the worst-case loss value at a point that is either too nearby or inappropriately distant; Fisher SAM avoids this. Another recent approach, Adaptive SAM, which stretches/shrinks the Euclidean ball according to the magnitudes of the parameters, can be risky and may even severely distort the neighborhood structure. We demonstrate the improved performance of the proposed Fisher SAM on several benchmark datasets/tasks.
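As a sketch of the contrast described above (the notation is ours, not taken verbatim from the paper): write L(θ) for the loss, g = ∇L(θ) for its gradient, and F(θ) for the Fisher information. SAM and Fisher SAM then solve the following min-max problems, whose inner maximisations admit the usual first-order closed-form perturbations.

```latex
% SAM: worst-case loss over a Euclidean ball of radius \rho
\min_{\theta} \; \max_{\|\epsilon\|_2 \le \rho} L(\theta + \epsilon),
\qquad
\epsilon^{*}_{\mathrm{SAM}} \approx \rho \, \frac{g}{\|g\|_2}

% Fisher SAM: worst-case loss over the ellipsoid induced by F(\theta)
\min_{\theta} \; \max_{\epsilon^{\top} F(\theta)\,\epsilon \,\le\, \rho^{2}} L(\theta + \epsilon),
\qquad
\epsilon^{*}_{\mathrm{FSAM}} \approx \rho \, \frac{F(\theta)^{-1} g}{\sqrt{g^{\top} F(\theta)^{-1} g}}
```

When F(θ) is the identity, the Fisher SAM perturbation reduces to SAM's, consistent with the abstract's view of the ellipsoid as a geometry-aware generalisation of the Euclidean ball.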
Author Information
Minyoung Kim (Samsung AI Center)
Da Li (Samsung)
Xu Hu (Ecole des Ponts ParisTech)
Timothy Hospedales (Samsung AI Centre / University of Edinburgh)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Fisher SAM: Information Geometry and Sharpness Aware Minimisation »
  Wed Jul 20 through Thu Jul 21, Hall E #512
More from the Same Authors
- 2022 : Attacking Adversarial Defences by Smoothing the Loss Landscape »
  Panagiotis Eustratiadis · Henry Gouk · Da Li · Timothy Hospedales
- 2022 : HyperInvariances: Amortizing Invariance Learning »
  Ruchika Chavhan · Henry Gouk · Jan Stuehmer · Timothy Hospedales
- 2022 : Feed-Forward Source-Free Latent Domain Adaptation via Cross-Attention »
  Ondrej Bohdal · Da Li · Xu Hu · Timothy Hospedales
- 2023 : Impact of Noise on Calibration and Generalisation of Neural Networks »
  Martin Ferianc · Ondrej Bohdal · Timothy Hospedales · Miguel Rodrigues
- 2023 : Evaluating the Evaluators: Are Current Few-Shot Learning Benchmarks Fit for Purpose? »
  Luísa Shimabucoro · Timothy Hospedales · Henry Gouk
- 2023 : Why Do Self-Supervised Models Transfer? On Data Augmentation and Feature Properties »
  Linus Ericsson · Henry Gouk · Timothy Hospedales
- 2022 Poster: Loss Function Learning for Domain Generalization by Implicit Gradient »
  Boyan Gao · Henry Gouk · Yongxin Yang · Timothy Hospedales
- 2022 Spotlight: Loss Function Learning for Domain Generalization by Implicit Gradient »
  Boyan Gao · Henry Gouk · Yongxin Yang · Timothy Hospedales
- 2021 Poster: Weight-covariance alignment for adversarially robust neural networks »
  Panagiotis Eustratiadis · Henry Gouk · Da Li · Timothy Hospedales
- 2021 Spotlight: Weight-covariance alignment for adversarially robust neural networks »
  Panagiotis Eustratiadis · Henry Gouk · Da Li · Timothy Hospedales
- 2019 Poster: Analogies Explained: Towards Understanding Word Embeddings »
  Carl Allen · Timothy Hospedales
- 2019 Oral: Analogies Explained: Towards Understanding Word Embeddings »
  Carl Allen · Timothy Hospedales
- 2019 Poster: Feature-Critic Networks for Heterogeneous Domain Generalization »
  Yiying Li · Yongxin Yang · Wei Zhou · Timothy Hospedales
- 2019 Oral: Feature-Critic Networks for Heterogeneous Domain Generalization »
  Yiying Li · Yongxin Yang · Wei Zhou · Timothy Hospedales
- 2018 Poster: Markov Modulated Gaussian Cox Processes for Semi-Stationary Intensity Modeling of Events Data »
  Minyoung Kim
- 2018 Oral: Markov Modulated Gaussian Cox Processes for Semi-Stationary Intensity Modeling of Events Data »
  Minyoung Kim