

Oral

Convolutional Poisson Gamma Belief Network

Chaojie Wang · Bo Chen · Sucheng Xiao · Mingyuan Zhou

Abstract:

To analyze a text corpus, one often resorts to a lossy representation that either completely ignores word order or embeds the words as low-dimensional dense feature vectors. In this paper, we propose convolutional Poisson factor analysis (CPFA), which operates directly on a lossless representation that treats the words in each document as a sequence of high-dimensional one-hot vectors. To boost its performance, we further propose the convolutional Poisson gamma belief network (CPGBN), which couples CPFA with the gamma belief network via a novel probabilistic pooling layer. CPFA forms words into phrases and captures very specific phrase-level topics, and CPGBN further builds a hierarchy of increasingly general phrase-level topics. We develop both an upward-downward Gibbs sampler, which makes the computation feasible by exploiting the extreme sparsity of the one-hot vectors, and a Weibull distribution based convolutional variational auto-encoder, which makes CPGBN even more scalable in both training and testing. Experimental results demonstrate that CPGBN can extract high-quality latent text representations that capture word order information, and hence can be leveraged as a building block to enrich a wide variety of existing discrete latent variable models that ignore word order.
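As a rough illustration of the lossless representation and the convolutional Poisson likelihood the abstract describes, the toy sketch below encodes a document as a sequence of one-hot vectors and computes a Poisson rate as a sum of convolutions of non-negative phrase-level filters with their activations. All sizes and the exact parameterization here are illustrative assumptions, not the paper's specification.

```python
import numpy as np

# Hypothetical toy setup (illustrative only, not the paper's exact model):
# vocabulary size V, document length N, K convolutional topics of width F.
rng = np.random.default_rng(0)
V, N, K, F = 6, 8, 2, 3

# Lossless representation: each word is a one-hot column, so the document
# is a V x N binary matrix that preserves word order.
words = rng.integers(0, V, size=N)
X = np.zeros((V, N), dtype=int)
X[words, np.arange(N)] = 1

# Convolutional factors: D[k] is a V x F non-negative filter (a phrase-level
# topic); W[k] is its length-(N - F + 1) non-negative activation sequence.
D = rng.gamma(1.0, 1.0, size=(K, V, F))
W = rng.gamma(1.0, 1.0, size=(K, N - F + 1))

# The Poisson rate at each word position accumulates every filter placement:
# activation position j of topic k covers word positions j .. j + F - 1.
rate = np.zeros((V, N))
for k in range(K):
    for j in range(N - F + 1):
        rate[:, j:j + F] += W[k, j] * D[k]

# The model would then assume X[v, n] ~ Poisson(rate[v, n]).
print(X.shape, rate.shape)
```

Because each word is a one-hot column, `X` has exactly one nonzero entry per position, which is the sparsity a Gibbs sampler can exploit by only visiting the observed word indices.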
