Oral
Wed Jul 11th 05:00 -- 05:20 PM @ A1
Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning
Ronan Fruit · Matteo Pirotta · Alessandro Lazaric · Ronald Ortner
Oral
Wed Jul 11th 05:20 -- 05:40 PM @ A1
Path Consistency Learning in Tsallis Entropy Regularized MDPs
Yinlam Chow · Ofir Nachum · Mohammad Ghavamzadeh
Oral
Wed Jul 11th 05:40 -- 05:50 PM @ A1
Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems
Marc Abeille · Alessandro Lazaric
Oral
Wed Jul 11th 05:50 -- 06:00 PM @ A1
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Stephen Tu · Benjamin Recht