Timezone: »
In E-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times until the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue with single ad exposures, ignoring the contribution of each exposure to the final conversion, thus usually falls into suboptimal solutions. In this paper, we formulate the sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space while ensuring the solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show the superior performance of our approaches over state-of-the-art baselines in terms of cumulative revenue.
Author Information
Xiaotian Hao (College of Intelligence and Computing, Tianjin University)
Zhaoqing Peng (Alibaba Group)
Yi Ma (Tianjin University)
Guan Wang (Department of Automation, Tsinghua University)
Junqi Jin (Alibaba Group)
Jianye Hao (Tianjin University)
Shan Chen (Alibaba Group)
Rongquan Bai (Alibaba Group)
Mingzhou Xie (Alibaba Group)
Miao Xu (Alibaba Group)
Zhenzhe Zheng (Shanghai Jiao Tong University)
Chuan Yu (Alibaba Group)
HAN LI (Alibaba Group)
Jian Xu (Alibaba Group)
Kun Gai (Alibaba group)
More from the Same Authors
-
2021 : Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning »
Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang -
2022 Poster: Individual Reward Assisted Multi-Agent Reinforcement Learning »
Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan -
2022 Poster: PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration »
Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Tong Sang · Yan Zheng · Jianye Hao · Matthew Taylor · Wenyuan Tao · Zhen Wang -
2022 Spotlight: Individual Reward Assisted Multi-Agent Reinforcement Learning »
Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan -
2022 Spotlight: PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration »
Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Tong Sang · Yan Zheng · Jianye Hao · Matthew Taylor · Wenyuan Tao · Zhen Wang -
2021 Poster: Principled Exploration via Optimistic Bootstrapping and Backward Induction »
Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang -
2021 Spotlight: Principled Exploration via Optimistic Bootstrapping and Backward Induction »
Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang -
2020 Poster: Learning Optimal Tree Models under Beam Search »
Jingwei Zhuo · Ziru Xu · Wei Dai · Han Zhu · HAN LI · Jian Xu · Kun Gai -
2020 Poster: Q-value Path Decomposition for Deep Multiagent Reinforcement Learning »
Yaodong Yang · Jianye Hao · Guangyong Chen · Hongyao Tang · Yingfeng Chen · Yujing Hu · Changjie Fan · Zhongyu Wei