Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC
Yulai Cong · Bo Chen · Hongwei Liu · Mingyuan Zhou

It is challenging to develop scalable stochastic-gradient-based inference for deep discrete latent variable models (LVMs), due to the difficulties in not only computing the gradients, but also adapting the step sizes to different latent factors and hidden layers. For the Poisson gamma belief network (PGBN), a recently proposed deep discrete LVM, we derive an alternative representation that is referred to as deep latent Dirichlet allocation (DLDA). Exploiting data augmentation and marginalization techniques, we derive a block-diagonal Fisher information matrix and its inverse for the simplex-constrained global model parameters of DLDA. Combining that Fisher information matrix with stochastic gradient MCMC, we present topic-layer-adaptive stochastic gradient Riemannian (TLASGR) MCMC, which jointly learns simplex-constrained global parameters across all layers and topics, with topic- and layer-specific learning rates. State-of-the-art results are demonstrated on big data sets.
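
To give a feel for why a closed-form inverse Fisher information matrix is available for simplex-constrained parameters, the sketch below checks a standard textbook result for a single multinomial distribution. It is not the DLDA derivation from the paper. In the reduced-mean parameterization (only the first K-1 probabilities are free), the Fisher information for n trials is n(diag(1/phi) + 11^T/phi_K), and its inverse is (diag(phi) - phi phi^T)/n. The function names and the example vector `phi` are hypothetical and chosen only for illustration.

```python
def multinomial_fisher(phi, n=1.0):
    """Fisher information of a multinomial with n trials, taken with respect
    to the first K-1 entries of phi (the last entry is 1 minus their sum)."""
    *free, last = phi
    k = len(free)
    return [[n * ((1.0 / free[i] if i == j else 0.0) + 1.0 / last)
             for j in range(k)] for i in range(k)]


def multinomial_fisher_inverse(phi, n=1.0):
    """Closed-form inverse on the free entries: (diag(phi) - phi phi^T) / n."""
    free = phi[:-1]
    k = len(free)
    return [[((free[i] if i == j else 0.0) - free[i] * free[j]) / n
             for j in range(k)] for i in range(k)]


def matmul(a, b):
    """Plain-Python matrix product for small dense matrices."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical topic vector on the 3-simplex; any strictly positive
# probability vector would do.
phi = [0.5, 0.3, 0.2]
product = matmul(multinomial_fisher(phi, n=10.0),
                 multinomial_fisher_inverse(phi, n=10.0))
# If the closed form is right, `product` is the identity up to rounding.
```

Because the inverse needs no matrix factorization, a preconditioner of this shape can be applied to each simplex-constrained vector at a cost linear in its length, which is the kind of structure a Riemannian sampler can exploit.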

Bo Chen, Ph.D., Professor. Before joining the Department of Electronic Engineering at Xidian University in 2013, I was a postdoctoral researcher, research scientist, and senior research scientist in the Department of Electrical and Computer Engineering at Duke University. In 2013 and 2014, I was elected into the Program for New Century Excellent Talents in University and the Program for Thousand Youth Talents, respectively. I am interested in developing statistical machine learning methods for complex and large-scale data. My current interests are in statistical signal processing, statistical machine learning, deep learning, and their applications to radar target detection and recognition.

