44 Results
Oral | Tue 10:30 | Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution | Vihang Patil · Markus Hofmarcher · Marius-Constantin Dinu · Matthias Dorfer · Patrick Blies · Johannes Brandstetter · Jose A. Arjona-Medina · Sepp Hochreiter
Poster | Tue 15:30 | Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution | Vihang Patil · Markus Hofmarcher · Marius-Constantin Dinu · Matthias Dorfer · Patrick Blies · Johannes Brandstetter · Jose A. Arjona-Medina · Sepp Hochreiter
Oral | Tue 8:00 | Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP | Liyu Chen · Rahul Jain · Haipeng Luo
Poster | Tue 15:30 | Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP | Liyu Chen · Rahul Jain · Haipeng Luo
Poster | Wed 15:30 | Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency | Qi Cai · Zhuoran Yang · Zhaoran Wang
Poster | Thu 15:00 | Lagrangian Method for Q-Function Learning (with Applications to Machine Translation) | Huang Bojun
Spotlight | Wed 11:15 | Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency | Qi Cai · Zhuoran Yang · Zhaoran Wang
Poster | Wed 15:30 | Reachability Constrained Reinforcement Learning | Dongjie Yu · Haitong Ma · Shengbo Li · Jianyu Chen
Spotlight | Thu 11:50 | Lagrangian Method for Q-Function Learning (with Applications to Machine Translation) | Huang Bojun
Spotlight | Wed 14:45 | Reachability Constrained Reinforcement Learning | Dongjie Yu · Haitong Ma · Shengbo Li · Jianyu Chen
Spotlight | Wed 7:50 | Generalized Data Distribution Iteration | Jiajun Fan · Changnan Xiao
Poster | Thu 15:00 | Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach | Shuang Wu · Ling Shi · Jun Wang · Guangjian Tian