10 Results
Workshop | Fri 8:00 | Off-Policy Evaluation from Logged Human Feedback Aniruddha Bhargava · Lalit Jain · Branislav Kveton · Ge Liu · Subhojyoti Mukherjee
Poster | Wed 2:30 | Fair Off-Policy Learning from Observational Data Dennis Frauen · Valentyn Melnychuk · Stefan Feuerriegel
Workshop | | Safer Reinforcement Learning by Going Off-policy: a Benchmark Igor Kuznetsov
Poster | Wed 2:30 | RVI-SAC: Average Reward Off-Policy Deep Reinforcement Learning Yukinari Hisaki · Isao Ono
Poster | Wed 2:30 | ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization Tianying Ji · Yongyuan Liang · Yan Zeng · Yu Luo · Guowei Xu · Jiawei Guo · Ruijie Zheng · Furong Huang · Fuchun Sun · Huazhe Xu
Poster | Thu 2:30 | Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic Tianying Ji · Yu Luo · Fuchun Sun · Xianyuan Zhan · Jianwei Zhang · Huazhe Xu
Oral | Wed 2:15 | ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization Tianying Ji · Yongyuan Liang · Yan Zeng · Yu Luo · Guowei Xu · Jiawei Guo · Ruijie Zheng · Furong Huang · Fuchun Sun · Huazhe Xu
Poster | Wed 4:30 | Off-policy Evaluation Beyond Overlap: Sharp Partial Identification Under Smoothness Samir Khan · Martin Saveski · Johan Ugander
Poster | Thu 2:30 | Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL Yu Luo · Tianying Ji · Fuchun Sun · Jianwei Zhang · Huazhe Xu · Xianyuan Zhan
Poster | Wed 2:30 | Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation Fengdi Che · Chenjun Xiao · Jincheng Mei · Bo Dai · Ramki Gummadi · Oscar Ramirez · Christopher Harris · Rupam Mahmood · Dale Schuurmans