Workshop
Sat Jul 23 05:45 AM -- 03:00 PM (PDT) @ Room 314 - 315
Complex feedback in online learning
Rémy Degenne · Pierre Gaillard · Wouter Koolen · Aadirupa Saha
While online learning has become one of the most successful and widely studied approaches in machine learning, particularly in reinforcement learning, online learning algorithms still interact with their environments in a very simple way. The complexity and diversity of the feedback coming from the environment in real applications is often reduced to the observation of a scalar reward. More and more researchers now seek to fully exploit the available feedback to enable faster and more human-like learning. This workshop aims to present a broad overview of the feedback types being actively researched, highlight recent advances, and provide a networking forum for researchers and practitioners.

Opening remarks (Remarks)
Learning from Preference Feedback in Combinatorial Action Spaces (Invited Speaker)
Delayed Feedback in Generalised Linear Bandits Revisited (Invited Speaker)
Break
Online learning in digital markets (Invited Speaker)
Beyond Learning from Demonstrations (Invited Speaker)
Near-optimal Regret for Adversarial MDP with Delayed Bandit Feedback (Oral)
Contextual Inverse Optimization: Offline and Online Learning (Oral)
Lunch Break (Break)
Decentralized Learning in Online Queuing Systems (Invited Speaker)
Giving Complex Feedback in Online Student Learning with Meta-Exploration (Oral)
Threshold Bandit Problem with Link Assumption between Pulls and Duels (Oral)
Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round (Oral)
Break
Poster session (Poster Session)
Prescriptive solutions in games: from theory to scale (Invited Speaker)
ActiveHedge: Hedge meets Active Learning (Oral)
Closing remarks (Remarks)
Optimal Parameter-free Online Learning with Switching Cost (Poster)
Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round (Poster)
Dynamical Linear Bandits for Long-Lasting Vanishing Rewards (Poster)
Giving Complex Feedback in Online Student Learning with Meta-Exploration (Poster)
Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP (Poster)
You Only Live Once: Single-Life Reinforcement Learning via Learned Reward Shaping (Poster)
ActiveHedge: Hedge meets Active Learning (Poster)
Big Control Actions Help Multitask Learning of Unstable Linear Systems (Poster)
On Adaptivity and Confounding in Contextual Bandit Experiments (Poster)
Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning (Poster)
Online Learning with Off-Policy Feedback (Poster)
Beyond IID: data-driven decision-making in heterogeneous environments (Poster)
Stochastic Rising Bandits for Online Model Selection (Poster)
Provably Correct SGD-based Exploration for Linear Bandit (Poster)
Unimodal Mono-Partite Matching in a Bandit Setting (Poster)
Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms (Poster)
Near-optimal Regret for Adversarial MDP with Delayed Bandit Feedback (Poster)
Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk (Poster)
Threshold Bandit Problem with Link Assumption between Pulls and Duels (Poster)
Choosing Answers in Epsilon-Best-Answer Identification for Linear Bandits (Poster)
Interaction-Grounded Learning with Action-inclusive Feedback (Poster)
On the Importance of Critical Period in Multi-stage Reinforcement Learning (Poster)
Contextual Inverse Optimization: Offline and Online Learning (Poster)
Challenging Common Assumptions in Convex Reinforcement Learning (Poster)
Adversarial Attacks Against Imitation and Inverse Reinforcement Learning (Poster)