Workshop
Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs
Theresa Eimer · Raghu Rajan · Julian Dierkes · André Biedenkapp · Vu Nguyen · Aleksandra Faust
Stolz 0
Sat 27 Jul, midnight PDT
The past few years have seen a surge of interest in reinforcement learning, with breakthrough successes in games, robotics, chemistry, logistics, nuclear fusion, and more. These headlines, however, obscure what remains a brittle technology, with many successes relying on heavily engineered solutions. Indeed, several recent works have demonstrated that RL algorithms are sensitive to seemingly mundane design choices. It is therefore often a significant challenge to apply RL effectively in practice, especially on novel problems, which limits its potential impact and narrows its accessibility. In this workshop, we want to bring together the different communities working on these problems. A variety of distinct sub-communities spanning RL, Meta-Learning, and AutoML have been working on making RL work "out-of-the-box" in arbitrary settings: this is the AutoRL setting. Recently, the emergence of LLMs and their in-context learning abilities has significantly impacted all of these communities. There are LLM agents tackling traditional RL tasks as well as few-shot RL agents that improve efficiency and generalization, both of which also aim to automate RL. LLMs have likewise influenced AutoML directly, for example through papers such as OptFormer. However, there is currently little crossover between these communities. We therefore want to create a space to connect them and cross-pollinate ideas on automating RL. We believe closer connections between these communities will ultimately lead to faster and more focused progress on AutoRL, and an in-person workshop is the ideal way to enable greater interaction between them. Through a mixture of diverse expert talks and opportunities for conversation, we hope to highlight the many facets of current AutoRL approaches and where collaboration across fields is possible.
Schedule
Sat 12:00 a.m. - 12:30 a.m. | Chelsea Finn (Invited Talk)
Sat 12:30 a.m. - 12:45 a.m. | Self-Exploring Language Models: Active Preference Elicitation for Online Alignment (Contributed Talk)
Sat 12:45 a.m. - 1:00 a.m. | BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL (Contributed Talk)
Sat 1:00 a.m. - 2:00 a.m. | DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning (Poster) | Anthony Liang · Guy Tennenholtz · Chih-wei Hsu · Yinlam Chow · Erdem Biyik · Craig Boutilier
Sat 1:00 a.m. - 2:00 a.m. | Recursive Introspection: Teaching Foundation Model Agents How to Self-Improve (Poster) | Yuxiao Qu · Tianjun Zhang · Naman Garg · Aviral Kumar
Sat 1:00 a.m. - 2:00 a.m. | Conditional Meta-Reinforcement Learning with State Representation (Poster) | YUXUAN SUN · Laura Toni · Yiannis Andreopoulos
Sat 1:00 a.m. - 2:00 a.m. | Learning In-Context Decision Making with Synthetic MDPs (Poster) | Akarsh Kumar · Christopher Lu · Louis Kirsch · Phillip Isola
Sat 1:00 a.m. - 2:00 a.m. | Self-Exploring Language Models: Active Preference Elicitation for Online Alignment (Poster) | Shenao Zhang · Donghan Yu · Hiteshi Sharma · Ziyi Yang · Shuohang Wang · Hany Hassan Awadalla · Zhaoran Wang
Sat 1:00 a.m. - 2:00 a.m. | Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? (Poster) | Denis Tarasov · Kirill Brilliantov · Dmitrii Kharlapenko
Sat 1:00 a.m. - 2:00 a.m. | Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search (Poster) | Jonathan Light · Min Cai · WEIQIN CHEN · Guanzhi Wang · Xiusi Chen · Wei Cheng · Yisong Yue · ziniu hu
Sat 1:00 a.m. - 2:00 a.m. | BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL (Poster) | Yu-Heng Hung · Kai-Jie Lin · Yu-Heng Lin · Chien-Yi Wang · Ping-Chun Hsieh
Sat 1:00 a.m. - 2:00 a.m. | Skill-Enhanced Reinforcement Learning Acceleration from Demonstrations (Poster) | Hanping Zhang · Yuhong Guo
Sat 1:00 a.m. - 2:00 a.m. | Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models (Poster) | Cong Lu · Shengran Hu · Jeff Clune
Sat 1:00 a.m. - 2:00 a.m. | Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels (Poster) | Zhuorui Ye · Stephanie Milani · Fei Fang · Geoff Gordon
Sat 1:00 a.m. - 2:00 a.m. | Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL (Poster) | Eduardo Pignatelli · Johan Ferret · Davide Paglieri · Samuel Coward · Tim Rocktäschel · Edward Grefenstette · Laura Toni
Sat 1:00 a.m. - 2:00 a.m. | Trace is the New AutoDiff — Unlocking Efficient Optimization of Computational Workflows (Poster) | Ching-An Cheng · Allen Nie · Adith Swaminathan
Sat 1:00 a.m. - 2:00 a.m. | Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency (Poster) | Yanxiao Zhao · Yangge Qian · Tianyi Wang · Jingyang Shan · Xiaolin Qin
Sat 1:00 a.m. - 2:00 a.m. | Scalable and Provable Exploration via HyperAgent for Foundation Model Decision-making (Poster) | Yingru Li · Jiawei Xu · Zhi-Quan Luo
Sat 1:00 a.m. - 2:00 a.m. | Discovering Preference Optimization Algorithms with and for Large Language Models (Poster) | Christopher Lu · Samuel Holt · Claudio Fanconi · Alexander Chan · Jakob Foerster · M van der Schaar · Robert Lange
Sat 1:00 a.m. - 2:00 a.m. | Vision-Language Models Provide Promptable Representations for Reinforcement Learning (Poster) | William Chen · Oier Mees · Aviral Kumar · Sergey Levine
Sat 1:00 a.m. - 2:00 a.m. | DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning (Poster) | Yifei Zhou · Hao Bai · Mert Cemri · Jiayi Pan · Alane Suhr · Sergey Levine · Aviral Kumar
Sat 1:00 a.m. - 2:00 a.m. | Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning (Poster) | Théo Vincent · Fabian Wahren · Jan Peters · Boris Belousov · Carlo D'Eramo
Sat 1:00 a.m. - 2:00 a.m. | XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX (Poster) | Alexander Nikulin · Vladislav Kurenkov · Ilya Zisman · Artem Agarkov · Viacheslav Sinii · Sergey Kolesnikov
Sat 1:00 a.m. - 2:00 a.m. | Distilling LLMs’ Decomposition Abilities into Compact Language Models (Poster) | Denis Tarasov · Kumar Shridhar
Sat 1:00 a.m. - 2:00 a.m. | Can Learned Optimization Make Reinforcement Learning Less Difficult? (Poster) | Alexander D. Goldie · Matthew T Jackson · Christopher Lu · Jakob Foerster · Shimon Whiteson
Sat 1:00 a.m. - 2:00 a.m. | Unfamiliar Finetuning Examples Control How Language Models Hallucinate (Poster) | Katie Kang · Eric Wallace · Claire Tomlin · Aviral Kumar · Sergey Levine
Sat 1:00 a.m. - 2:00 a.m. | STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making (Poster) | Chuanhao Li · Runhan Yang · Tiankai Li · Milad Bafarassat · Kourosh Sharifi · Dirk Bergemann · Zhuoran Yang
Sat 1:00 a.m. - 2:00 a.m. | Higher Order and Self-Referential Evolution for Population-based Methods (Poster) | Samuel Coward · Christopher Lu · Alistair Letcher · Minqi Jiang · Jack Parker-Holder · Jakob Foerster
Sat 1:00 a.m. - 2:00 a.m. | Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity (Poster) | Vahid Balazadeh · Keertana Chidambaram · Viet Nguyen · Rahul G. Krishnan · Vasilis Syrgkanis
Sat 2:00 a.m. - 2:30 a.m. | Roberta Raileanu: Learning to Solve New Sequential Decision-Making Tasks with In-Context Learning (Invited Talk)
Sat 2:30 a.m. - 3:00 a.m. | Pierluca D'Oro: AI-Assisted Agent Design with Large Language Models and Reinforcement Learning (Invited Talk)
Sat 3:00 a.m. - 3:30 a.m. | What about X for AutoRL? (Breakout Session)
Sat 5:00 a.m. - 5:30 a.m. | Michael Dennis: Genie: Generative Interactive Environments (Invited Talk)
Sat 5:30 a.m. - 5:45 a.m. | Can Learned Optimization Make Reinforcement Learning Less Difficult? (Contributed Talk)
Sat 5:45 a.m. - 6:15 a.m. | Pablo Samuel Castro: In Defense of Atari: The ALE as a Benchmark for AutoRL (Invited Talk)
Sat 7:00 a.m. - 8:00 a.m. | Future Perspectives in AutoRL (Panel Discussion) | Pablo Samuel Castro · Alexander Goldie · Jacob Beck · Doina Precup