Oral

Generative Adversarial User Model for Reinforcement Learning Based Recommendation System

Xinshi Chen · Shuang Li · Hui Li · Shaohua Jiang · Yuan Qi · Le Song

Abstract:

We propose a novel model-based reinforcement learning framework for recommendation systems, in which we develop a GAN formulation to model a user's behavior dynamics and her associated reward function. Using this user model as the simulation environment, we develop a novel cascading Q-network for combinatorial recommendation policies that can handle a large number of candidate items efficiently. Although the experiments show clear benefits of our method in an offline, realistic simulation setting, even stronger results may be obtained via future online A/B testing.
