Poster in Workshop: Geometry-grounded Representation Learning and Generative Modeling
Scalable Local Intrinsic Dimension Estimation with Diffusion Models
Hamidreza Kamkari · Brendan Ross · Rasa Hosseinzadeh · Jesse Cresswell · Gabriel Loaiza-Ganem
Keywords: [ Diffusion Models ] [ manifold hypothesis ] [ Intrinsic dimension estimation ]
High-dimensional data commonly lies on low-dimensional submanifolds, and estimating the local intrinsic dimension (LID) of a datum is a longstanding problem. LID can be understood as the number of local factors of variation: the more factors of variation a datum has, the more complex it tends to be. Estimating this quantity has proven useful in contexts ranging from generalization in neural networks to detection of out-of-distribution data, adversarial examples, and AI-generated text. While many estimation techniques exist, they all either are inaccurate or do not scale. In this work, we show that the Fokker-Planck equation associated with a diffusion model can provide the first LID estimator that scales to high-dimensional data while outperforming existing baselines on LID estimation benchmarks.
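To make the notion of LID as "the number of local factors of variation" concrete, here is a minimal sketch of a classical local-PCA estimator on synthetic data. This is not the diffusion-based estimator proposed in this work; it is a simple baseline of the kind the abstract contrasts against. The function name `local_pca_lid`, the neighbourhood size `k`, and the variance threshold are all illustrative assumptions.

```python
import numpy as np

def local_pca_lid(data, query, k=50, var_threshold=0.05):
    """Illustrative local-PCA LID estimate at `query` (assumed baseline,
    not the paper's method): count the principal directions of the k nearest
    neighbours that each explain more than `var_threshold` of the variance."""
    dists = np.linalg.norm(data - query, axis=1)
    neighbours = data[np.argsort(dists)[:k]]
    centred = neighbours - neighbours.mean(axis=0)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return int(np.sum(explained > var_threshold))

# Synthetic data: a 2-dimensional plane embedded in 5-dimensional space,
# perturbed by a small amount of isotropic noise.
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=(2000, 2))      # two factors of variation
basis = np.linalg.qr(rng.normal(size=(5, 2)))[0]     # 5x2 matrix, orthonormal columns
data = latent @ basis.T + 0.01 * rng.normal(size=(2000, 5))

# The origin lies on the noiseless plane, well inside the sampled region.
lid = local_pca_lid(data, np.zeros(5))
print(lid)
```

On this toy plane the estimate should recover the two underlying factors of variation. Estimators of this kind depend on nearest-neighbour searches and on a hand-picked threshold, which is one reason such baselines tend to become inaccurate or expensive as the ambient dimension grows.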