The theme of this workshop is real-world applications of reinforcement learning. We will give a demo and tutorial of Azure Personalizer, an award-winning, easy-to-use RL cloud service (https://azure.microsoft.com/en-us/services/cognitive-services/personalizer/). The first release of Personalizer was presented at an ICML 2019 workshop. We will present the latest additions to the service, including multi-slot personalization.
We will also demonstrate how to use the latest release of Vowpal Wabbit (VW), an open-source machine learning library (https://vowpalwabbit.org/). It provides fast, scalable machine learning and has unique capabilities such as learning to search, active learning, contextual memory, and extreme multiclass learning. It focuses on reinforcement learning and provides production-ready implementations of contextual bandit (CB) algorithms.
This part will include:
- Demo of COBA, a benchmarking framework for CB algorithms (https://github.com/VowpalWabbit/coba)
- Using pandas DataFrames with pyVW
- AutoML & pyVW
- Hands-on demo of continuous actions with VW
- Integrating VW with Apache Spark
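As a rough illustration of what moving tabular data into VW involves, the sketch below turns logged interactions into VW's contextual bandit text format, where each line reads `action:cost:probability | features`. The helper name `to_vw_cb_line`, the feature names, and the sample rows are hypothetical and chosen only for this example. This is not the DFtoVW API, and it uses plain Python dictionaries in place of a DataFrame so that it runs without extra dependencies.

```python
def to_vw_cb_line(action, cost, probability, features):
    """Format one logged interaction as a VW contextual bandit text line.

    Hypothetical helper for illustration; not part of pyVW or DFtoVW.
    """
    parts = []
    for name, value in features.items():
        # Numeric features are written as name:value.
        # Anything else becomes a single categorical token name=value.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            parts.append(f"{name}:{value}")
        else:
            parts.append(f"{name}={value}")
    return f"{action}:{cost}:{probability} | {' '.join(parts)}"


# Made-up logged data: chosen action, observed cost, logging probability, context.
rows = [
    {"action": 1, "cost": 2.0, "probability": 0.4,
     "features": {"age": 25, "device": "mobile"}},
    {"action": 3, "cost": 0.0, "probability": 0.2,
     "features": {"age": 41, "device": "desktop"}},
]

lines = [to_vw_cb_line(**row) for row in rows]
for line in lines:
    print(line)
```

Lines in this shape are what VW's contextual bandit mode reads as text input. A dedicated converter would additionally need to handle namespaces, missing values, and escaping of special characters.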
Sun 5:00 a.m. - 5:30 a.m. | Introduction (Talk) | John Langford
Sun 5:30 a.m. - 6:00 a.m. | Microsoft Azure Personalizer (Talk) | Sheetal Lahabar
Sun 6:00 a.m. - 6:30 a.m. | CCB vs Slates (Talk) | Pavithra Srinath
Sun 6:30 a.m. - 7:00 a.m. | AutoML (Talk) | Qingyun Wu
Sun 7:00 a.m. - 7:30 a.m. | Continuous Actions in VW (Talk) | Olga Vrousgou
Sun 7:30 a.m. - 8:00 a.m. | COBA (Talk) | Mark Rucker
Sun 8:00 a.m. - 8:30 a.m. | DFtoVW: using Panda Dataframes with pyvw (Talk) | Etienne Kintzler
Sun 8:30 a.m. - 9:00 a.m. | VW & Apache Spark (Talk) | Bogdan Mazoure
Sun 9:00 a.m. - 9:30 a.m. | VW update (Talk) | Jack Gerrits
Author Information
Sheetal Lahabar
Etienne Kintzler
Mark Rucker
Bogdan Mazoure (MILA, McGill University)
Qingyun Wu
Pavithra Srinath (Microsoft Research)
Jack Gerrits
Olga Vrousgou (Microsoft Research)
John Langford (Microsoft Research)
Eduardo Salinas (Microsoft)