Complex feedback in online learning

Workshop

Rémy Degenne · Pierre Gaillard · Wouter Koolen · Aadirupa Saha

Room 314 - 315

Sat 23 Jul, 5:45 a.m. PDT

While online learning has become one of the most successful and studied approaches in machine learning, in particular with reinforcement learning, online learning algorithms still interact with their environments in a very simple way. The complexity and diversity of the feedback coming from the environment in real applications are often reduced to the observation of a scalar reward. More and more researchers now seek to fully exploit the available feedback to allow faster and more human-like learning. This workshop aims to present a broad overview of the feedback types being actively researched, highlight recent advances, and provide a networking forum for researchers and practitioners.

Timezone: America/Los_Angeles

Schedule

Sat 5:45 a.m. - 6:00 a.m.	Opening remarks ( Remarks )
Sat 6:00 a.m. - 6:30 a.m.	Learning from Preference Feedback in Combinatorial Action Spaces ( Invited Speaker )	Thorsten Joachims
Sat 6:30 a.m. - 7:00 a.m.	Delayed Feedback in Generalised Linear Bandits Revisited ( Invited Speaker )	Ciara Pike-Burke
Sat 7:00 a.m. - 7:30 a.m.	Break
Sat 7:30 a.m. - 8:00 a.m.	Online learning in digital markets ( Invited Speaker )	Nicolò Cesa-Bianchi
Sat 8:00 a.m. - 8:30 a.m.	Beyond Learning from Demonstrations ( Invited Speaker )	Andreea Bobu
Sat 8:30 a.m. - 8:50 a.m.	Near-optimal Regret for Adversarial MDP with Delayed Bandit Feedback ( Oral )	Tiancheng Jin · Tal Lancewicki · Haipeng Luo · Yishay Mansour · Aviv Rosenberg
Sat 8:50 a.m. - 9:10 a.m.	Contextual Inverse Optimization: Offline and Online Learning ( Oral )	Omar Besbes · Yuri Fonseca · Ilan Lobel
Sat 9:10 a.m. - 10:30 a.m.	Lunch Break
Sat 10:30 a.m. - 11:00 a.m.	Decentralized Learning in Online Queuing Systems ( Invited Speaker )	Vianney Perchet
Sat 11:00 a.m. - 11:20 a.m.	Giving Complex Feedback in Online Student Learning with Meta-Exploration ( Oral )	Evan Liu · Moritz Stephan · Allen Nie · Chris Piech · Emma Brunskill · Chelsea Finn
Sat 11:20 a.m. - 11:40 a.m.	Threshold Bandit Problem with Link Assumption between Pulls and Duels ( Oral )	Keshav Narayan · Aarti Singh
Sat 11:40 a.m. - 12:00 p.m.	Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round ( Oral )	Manh Hung Nguyen · Lisheng Sun · Nathan Grinsztajn · Isabelle Guyon
Sat 12:00 p.m. - 12:30 p.m.	Break
Sat 12:30 p.m. - 1:30 p.m.	Poster session ( Poster Session )
Sat 1:30 p.m. - 2:00 p.m.	Prescriptive solutions in games: from theory to scale ( Invited Speaker )	Julien Perolat
Sat 2:00 p.m. - 2:20 p.m.	ActiveHedge: Hedge meets Active Learning ( Oral )	Bhuvesh Kumar · Jacob Abernethy · Venkatesh Saligrama
Sat 2:20 p.m. - 2:30 p.m.	Closing remarks ( Remarks )
-	Optimal Parameter-free Online Learning with Switching Cost ( Poster )	Zhiyu Zhang · Ashok Cutkosky · Ioannis Paschalidis
-	Challenging Common Assumptions in Convex Reinforcement Learning ( Poster )	Mirco Mutti · Riccardo De Santi · Piersilvio De Bartolomeis · Marcello Restelli
-	Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms ( Poster )	MohammadJavad Azizi · Thang Duong · Yasin Abbasi-Yadkori · Claire Vernade · Andras Gyorgy · Mohammad Ghavamzadeh
-	Provably Correct SGD-based Exploration for Linear Bandit ( Poster )	Jialin Dong · Lin Yang
-	You Only Live Once: Single-Life Reinforcement Learning via Learned Reward Shaping ( Poster )	Annie Chen · Archit Sharma · Sergey Levine · Chelsea Finn
-	Stochastic Rising Bandits for Online Model Selection ( Poster )	Alberto Maria Metelli · Francesco Trovò · Matteo Pirola · Marcello Restelli
-	Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP ( Poster )	Orin Levy · Yishay Mansour
-	Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk ( Poster )	Tianrui Chen · Aditya Gangrade · Venkatesh Saligrama
-	Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning ( Poster )	Dingwen Kong · Lin Yang
-	Dynamical Linear Bandits for Long-Lasting Vanishing Rewards ( Poster )	Marco Mussi · Alberto Maria Metelli · Marcello Restelli
-	Online Learning with Off-Policy Feedback ( Poster )	Germano Gabbianelli · Matteo Papini · Gergely Neu
-	On Adaptivity and Confounding in Contextual Bandit Experiments ( Poster )	Chao Qin · Daniel Russo
-	Unimodal Mono-Partite Matching in a Bandit Setting ( Poster )	Matthieu Rodet · Romaric Gaudel
-	Beyond IID: data-driven decision-making in heterogeneous environments ( Poster )	Omar Besbes · Will Ma · Omar Mouchtaki
-	Big Control Actions Help Multitask Learning of Unstable Linear Systems ( Poster )	Aditya Modi · Ziping Xu · Mohamad Kazem Shirani Faradonbeh · Ambuj Tewari
-	Adversarial Attacks Against Imitation and Inverse Reinforcement Learning ( Poster )	Ezgi Korkmaz
-	Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits ( Poster )	Marc Jourdan · Rémy Degenne
-	Interaction-Grounded Learning with Action-inclusive Feedback ( Poster )	Tengyang Xie · Akanksha Saran · Dylan Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford
-	On the Importance of Critical Period in Multi-stage Reinforcement Learning ( Poster )	Junseok Park · Inwoo Hwang · Min Whoo Lee · Hyunseok Oh · Minsu Lee · Youngki Lee · Byoung-Tak Zhang
-	ActiveHedge: Hedge meets Active Learning ( Poster )	Bhuvesh Kumar · Jacob Abernethy · Venkatesh Saligrama
-	Near-optimal Regret for Adversarial MDP with Delayed Bandit Feedback ( Poster )	Tiancheng Jin · Tal Lancewicki · Haipeng Luo · Yishay Mansour · Aviv Rosenberg
-	Giving Complex Feedback in Online Student Learning with Meta-Exploration ( Poster )	Evan Liu · Moritz Stephan · Allen Nie · Chris Piech · Emma Brunskill · Chelsea Finn
-	Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round ( Poster )	Manh Hung Nguyen · Lisheng Sun · Nathan Grinsztajn · Isabelle Guyon
-	Threshold Bandit Problem with Link Assumption between Pulls and Duels ( Poster )	Keshav Narayan · Aarti Singh
-	Contextual Inverse Optimization: Offline and Online Learning ( Poster )	Omar Besbes · Yuri Fonseca · Ilan Lobel