Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process
Zhibin Duan · Xinyang Liu · Yudi Su · Yishi Xu · Bo Chen · Mingyuan Zhou

Deep topic models have shown an impressive ability to extract multi-layer latent document representations and discover hierarchical, semantically meaningful topics. However, most deep topic models are limited to a single-step generative process, even though progressive generative processes have achieved impressive performance in modeling image data. To this end, we propose a novel progressive deep topic model that consists of a knowledge-informed textual data coarsening process and a corresponding progressive generative model. The former builds multi-level observations ranging from concrete to abstract, while the latter gradually generates more concrete observations. Additionally, we incorporate a graph-enhanced decoder to capture the semantic relationships among words at different levels of observation. Furthermore, we analyze the proposed model theoretically from an information-theoretic perspective and show how it can alleviate the well-known "latent variable collapse" problem. Finally, extensive experiments demonstrate that the proposed model improves the performance of deep topic models, yielding higher-quality latent document representations and topics.

Author Information

Zhibin Duan (Xidian University)
Xinyang Liu (Xidian University)
Yudi Su (Xidian University)
Yishi Xu (Xidian University)
Bo Chen (School of Electronic Engineering, Xidian University)

Bo Chen, Ph.D., Professor. Before joining the Department of Electronic Engineering at Xidian University in 2013, I was a post-doc researcher, research scientist, and senior research scientist in the Department of Electrical and Computer Engineering at Duke University. In 2013 and 2014, I was selected for the Program for New Century Excellent Talents in University and the Program for Thousand Youth Talents, respectively. I am interested in developing statistical machine learning methods for complex and large-scale data. My current interests are in statistical signal processing, statistical machine learning, deep learning, and their applications to radar target detection and recognition.

Mingyuan Zhou (University of Texas at Austin)

