Spotlight | Tue 19:30 | Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models | Fan Bao · Kun Xu · Chongxuan Li · Lanqing Hong · Jun Zhu · Bo Zhang
Poster | Tue 21:00 | Discovering symbolic policies with deep reinforcement learning | Mikel Landajuela Larma · Brenden Petersen · Sookyung Kim · Claudio Santiago · Ruben Glatt · Nathan Mundhenk · Jacob Pettit · Daniel Faissol
Spotlight | Tue 17:45 | Discovering symbolic policies with deep reinforcement learning | Mikel Landajuela Larma · Brenden Petersen · Sookyung Kim · Claudio Santiago · Ruben Glatt · Nathan Mundhenk · Jacob Pettit · Daniel Faissol
Poster | Tue 9:00 | Counterfactual Credit Assignment in Model-Free Reinforcement Learning | Thomas Mesnard · Theophane Weber · Fabio Viola · Shantanu Thakoor · Alaa Saade · Anna Harutyunyan · Will Dabney · Thomas Stepleton · Nicolas Heess · Arthur Guez · Eric Moulines · Marcus Hutter · Lars Buesing · Remi Munos
Spotlight | Tue 7:40 | Counterfactual Credit Assignment in Model-Free Reinforcement Learning | Thomas Mesnard · Theophane Weber · Fabio Viola · Shantanu Thakoor · Alaa Saade · Anna Harutyunyan · Will Dabney · Thomas Stepleton · Nicolas Heess · Arthur Guez · Eric Moulines · Marcus Hutter · Lars Buesing · Remi Munos
Spotlight | Tue 17:30 | From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization | Julien Perolat · Remi Munos · Jean-Baptiste Lespiau · Shayegan Omidshafiei · Mark Rowland · Pedro Ortega · Neil Burch · Thomas Anthony · David Balduzzi · Bart De Vylder · Georgios Piliouras · Marc Lanctot · Karl Tuyls
Poster | Tue 21:00 | From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization | Julien Perolat · Remi Munos · Jean-Baptiste Lespiau · Shayegan Omidshafiei · Mark Rowland · Pedro Ortega · Neil Burch · Thomas Anthony · David Balduzzi · Bart De Vylder · Georgios Piliouras · Marc Lanctot · Karl Tuyls
Spotlight | Wed 17:35 | Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization | Stanislaw Jastrzebski · Devansh Arpit · Oliver Astrand · Giancarlo Kerg · Huan Wang · Caiming Xiong · Richard Socher · Kyunghyun Cho · Krzysztof J Geras
Poster | Wed 21:00 | Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization | Stanislaw Jastrzebski · Devansh Arpit · Oliver Astrand · Giancarlo Kerg · Huan Wang · Caiming Xiong · Richard Socher · Kyunghyun Cho · Krzysztof J Geras