

Poster

Near-optimal sample complexity bounds for learning Latent k-polytopes and applications to Ad-Mixtures

Chiranjib Bhattacharyya · Ravindran Kannan

Keywords: [ Combinatorial Optimization ] [ Generative Models ] [ Learning Theory ] [ Statistical Learning Theory ] [ Unsupervised and Semi-supervised Learning ]


Abstract: Deriving optimal bounds on the sample complexity of latent variable models is an active area of research. Recently such bounds were obtained for mixtures of Gaussians [HSNCAY18], but no such results are known for Ad-mixtures, a generalization of mixture distributions. In this paper we show that $O(dk/m)$ samples are sufficient to learn each of the $k$ topic vectors of LDA, a popular Ad-mixture model, with vocabulary size $d$ and $m \in \Omega(1)$ words per document, to any constant error in $L_1$ norm. The result is a corollary of the major contribution of this paper: the first sample complexity upper bound for the problem (introduced in [BK20]) of learning the vertices of a Latent $k$-Polytope in $\mathbb{R}^d$, given perturbed points from it. The bound, $O(dk/\beta)$, is optimal and linear in the number of parameters. It applies to many stochastic models, including a broad class of Ad-mixtures. To demonstrate the generality of the approach, we specialize the setting to Mixed Membership Stochastic Block Models (MMSB) and show for the first time that if an MMSB has $k$ blocks, the sample complexity is $O(k^2)$ under the usual assumptions.
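To get a feel for the scaling, the sketch below plugs hypothetical corpus parameters into the $dk/m$ rate stated in the abstract; the $O(\cdot)$ hides constants (and possibly logarithmic factors), so the numbers are only an order-of-magnitude illustration, not a figure quoted from the paper, and the example values of $d$, $k$, and $m$ are assumptions.

```python
# Order-of-magnitude illustration of the dk/m sample-complexity rate for LDA
# claimed in the abstract. Constants and any logarithmic factors hidden by the
# O(.) are suppressed; d, k, m below are hypothetical example values.

def lda_sample_rate(d: int, k: int, m: int) -> float:
    """Return dk/m, the leading term of the claimed document-count bound."""
    return d * k / m

# Hypothetical corpus: 50,000-word vocabulary, 100 topics, 200 words/document.
d, k, m = 50_000, 100, 200
print(f"dk/m   = {lda_sample_rate(d, k, m):,.0f} documents (up to constants)")

# Doubling the document length m halves the leading term, reflecting the 1/m factor.
print(f"dk/(2m) = {lda_sample_rate(d, k, 2 * m):,.0f} documents (up to constants)")
```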
