Decision Awareness in Reinforcement Learning

Workshop

Evgenii Nikishin · Pierluca D'Oro · Doina Precup · Andre Barreto · Amir-massoud Farahmand · Pierre-Luc Bacon

Hall G

Fri 22 Jul, 6 a.m. PDT

The goal of reinforcement learning (RL) is to maximize a reward signal by making optimal decisions. An RL system typically contains several moving components, possibly including a policy, a value function, and a model of the environment. We refer to decision awareness as the notion that each of these components, and their combination, should be explicitly trained to help the agent increase the total amount of collected reward. To better understand decision awareness, consider a model-based method as an example. For environments with rich observations (e.g., pixel-based), the world model is complex, and standard approaches would need a large number of samples and a high-capacity function approximator to learn a reasonable approximation of the dynamics. However, a decision-aware agent might recognize that modeling all the granular complexity of the environment is neither feasible nor necessary for learning an optimal policy, and instead focus on modeling the aspects that matter for decision making. Decision awareness goes beyond model learning. In actor-critic algorithms, a critic is trained to predict the expected return and is later used to aid policy learning. Is return prediction an optimal strategy for critic learning? And, in general, what is the best way to learn each component of an RL system? Our workshop aims to answer these questions and to articulate that decision awareness might be key to solving grand challenges in RL, including exploration and sample efficiency. The workshop is about decision-aware RL algorithms, their implications, and real-world applications; we focus on decision-aware objectives, end-to-end procedures, and meta-learning techniques for training and discovering components in modular RL systems, as well as theoretical or empirical analyses of the interaction among multiple modules used by RL algorithms.

Timezone: America/Los_Angeles

Schedule

Fri 6:00 a.m. - 5:00 p.m.	Please visit the workshop website for the full program (Program)
Fri 6:00 a.m. - 6:20 a.m.	Opening Remarks (Presentation)
Fri 6:20 a.m. - 7:00 a.m.	Differentiable optimization for control and reinforcement learning (Invited Talk)	Brandon Amos
Fri 7:00 a.m. - 7:30 a.m.	Break
Fri 7:30 a.m. - 8:10 a.m.	Discovering RL Algorithms (Invited Talk)	Junhyuk Oh
Fri 8:10 a.m. - 9:00 a.m.	Discovered Policy Optimisation; Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy; Adaptive Interest for Emphatic Reinforcement Learning (Contributed Talks)
Fri 9:00 a.m. - 10:40 a.m.	Break
Fri 10:40 a.m. - 11:20 a.m.	The Value Equivalence Principle for Model-Based RL (Invited Talk)	Christopher Grimm
Fri 11:20 a.m. - 12:00 p.m.	A Model-Based Reinforcement Learning Wishlist (Invited Talk)	Erin Talvitie
Fri 12:00 p.m. - 12:30 p.m.	Break
Fri 12:30 p.m. - 1:30 p.m.	DARL Panel (Panel Discussion)
Fri 1:30 p.m. - 2:30 p.m.	Poster Session (In-person only poster presentation)
Fri 2:30 p.m. - 3:10 p.m.	Policy Gradient: Theory for Making Best Use of It (Invited Talk)	Mengdi Wang
Fri 3:10 p.m. - 3:50 p.m.	General-purpose meta learning (Invited Talk)	Louis Kirsch
Fri 3:50 p.m. - 5:00 p.m.	Closing Remarks & Poster Session (Presentation followed by an in-person only poster presentation)
-	Effective Offline RL Needs Going Beyond Pessimism: Representations and Distributional Shift (Poster)	Xinyang Geng · Kevin Li · Abhishek Gupta · Aviral Kumar · Sergey Levine
-	Hyperbolically Discounted Advantage Estimation for Generalization in Reinforcement Learning (Poster)	Nasik Muhammad Nafi · Raja Farrukh Ali · William Hsu
-	Deep Policy Generators (Poster)	Francesco Faccio · Vincent Herrmann · Aditya Ramesh · Louis Kirsch · Jürgen Schmidhuber
-	CoMBiNED: Multi-Constrained Model Based Planning for Navigation in Dynamic Environments (Poster)	Harit Pandya · Rudra Poudel · Stephan Liwicki
-	Exploration Hurts in Bandits with Partially Observed Stochastic Contexts (Poster)	Hongju Park · Mohamad Kazem Shirani Faradonbeh
-	Exploration in Reward Machines with Low Regret (Poster)	Hippolyte Bourel · Anders Jonsson · Odalric-Ambrym Maillard · Mohammad Sadegh Talebi
-	Exploring Long-Horizon Reasoning with Deep RL in Combinatorially Hard Tasks (Poster)	Andrew C Li · Pashootan Vaezipoor · Rodrigo A Toro Icarte · Sheila McIlraith
-	VIPer: Iterative Value-Aware Model Learning on the Value Improvement Path (Poster)	Romina Abachi · Claas Voelcker · Animesh Garg · Amir-massoud Farahmand
-	Model-Based Meta Automatic Curriculum Learning (Poster)	Zifan Xu · Yulin Zhang · Shahaf Shperberg · Reuth Mirsky · Yuqian Jiang · Bo Liu · Peter Stone
-	Adaptive Interest for Emphatic Reinforcement Learning (Spotlight)	Martin Klissarov · Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Taesup Kim · Alex Smola
-	General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States (Poster)	Francesco Faccio · Aditya Ramesh · Vincent Herrmann · Jean Harb · Jürgen Schmidhuber
-	An Investigation into the Open World Survival Game Crafter (Poster)	Aleksandar Stanic · Yujin Tang · David Ha · Jürgen Schmidhuber
-	Unsupervised Model-based Pre-training for Data-efficient Reinforcement Learning from Pixels (Poster)	Sai Rajeswar · Pietro Mazzaglia · Tim Verbelen · Alex Piche · Bart Dhoedt · Aaron Courville · Alexandre Lacoste
-	Model-Based Reinforcement Learning with SINDy (Poster)	Rushiv Arora · Eliot Moss · Bruno da Silva
-	Toward Human Cognition-inspired High-Level Decision Making For Hierarchical Reinforcement Learning Agents (Poster)	Rousslan F. J. Dossa · Takashi Matsubara
-	MoCoDA: Model-based Counterfactual Data Augmentation (Poster)	Silviu Pitis · Elliot Creager · Ajay Mandlekar · Animesh Garg
-	An Adaptive Entropy-Regularization Framework for Multi-Agent Reinforcement Learning (Poster)	Woojun Kim · Youngchul Sung
-	Leader-based Decision Learning for Cooperative Multi-Agent Reinforcement Learning (Poster)	Wenqi Chen · Xin Zeng · Amber Li
-	Recursive History Representations for Unsupervised Reinforcement Learning in Multiple-Environments (Poster)	Mirco Mutti · Pietro Maldini · Riccardo De Santi · Marcello Restelli
-	Building a Subspace of Policies for Scalable Continual Learning (Poster)	Jean-Baptiste Gaya · Thang Doan · Lucas Caccia · Laure Soulier · Ludovic Denoyer · Roberta Raileanu
-	DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning (Poster)	Quan Vuong · Aviral Kumar · Sergey Levine · Yevgen Chebotar
-	Representation Gap in Deep Reinforcement Learning (Poster)	Qiang He · Huangyuan Su · Jieyu Zhang · Xinwen Hou
-	Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations (Poster)	Cong Lu · Philip Ball · Tim G. J Rudner · Jack Parker-Holder · Michael A Osborne · Yee-Whye Teh
-	Giving Feedback on Interactive Student Programs with Meta-Exploration (Poster)	Evan Liu · Moritz Stephan · Allen Nie · Chris Piech · Emma Brunskill · Chelsea Finn
-	When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning (Poster)	Annie Xie · Fahim Tajwar · Archit Sharma · Chelsea Finn
-	Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions (Poster)	Audrey Huang · Nan Jiang
-	Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees (Poster)	Siliang Zeng · Chenliang Li · Alfredo Garcia · Mingyi Hong
-	You Can’t Count on Luck: Why Decision Transformers Fail in Stochastic Environments (Poster)	Keiran Paster · Sheila McIlraith · Jimmy Ba
-	Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games (Poster)	Dingyang Chen · Qi Zhang · Thinh Doan
-	Fast Convergence for Unstable Reinforcement Learning Problems by Logarithmic Mapping (Poster)	Wang Zhang · Lam Nguyen · Subhro Das · Alexandre Megretsky · Luca Daniel · Tsui-Wei Weng
-	Self-Referential Meta Learning (Poster)	Louis Kirsch · Jürgen Schmidhuber
-	Distributionally Adaptive Meta Reinforcement Learning (Poster)	Anurag Ajay · Dibya Ghosh · Sergey Levine · Pulkit Agrawal · Abhishek Gupta
-	You Only Live Once: Single-Life Reinforcement Learning via Learned Reward Shaping (Poster)	Annie Chen · Archit Sharma · Sergey Levine · Chelsea Finn
-	Discovered Policy Optimisation (Spotlight)	Christopher Lu · Jakub Grudzien Kuba · Alistair Letcher · Luke Metz · Christian Schroeder · Jakob Foerster
-	Directed Exploration via Uncertainty-Aware Critics (Poster)	Amarildo Likmeta · Matteo Sacco · Alberto Maria Metelli · Marcello Restelli
-	Adversarial Cheap Talk (Poster)	Christopher Lu · Timon Willi · Alistair Letcher · Jakob Foerster
-	Adaptive Intrinsic Motivation with Decision Awareness (Poster)	Suyoung Lee · Sae-Young Chung
-	Leveraging Factored Action Spaces for Efficient Offline Reinforcement Learning in Healthcare (Poster)	Shengpu Tang · Maggie Makar · Michael Sjoding · Finale Doshi-Velez · Jenna Wiens
-	Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting (Poster)	Nicolai Dorka · Tim Welschehold · Wolfram Burgard
-	Task Factorization in Curriculum Learning (Poster)	Reuth Mirsky · Shahaf Shperberg · Yulin Zhang · Zifan Xu · Yuqian Jiang · Jiaxun Cui · Peter Stone
-	SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition (Poster)	Dylan Slack · Yinlam Chow · Bo Dai · Nevan Wichers
-	Guided Exploration in Reinforcement Learning via Monte Carlo Critic Optimization (Poster)	Igor Kuznetsov
-	Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy (Spotlight)	Xiyao Wang · Wichayaporn Wongkamjan · Furong Huang
-	Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning (Poster)	Dilip Arumugam · Benjamin Van Roy
-	Generalization of Reinforcement Learning with Policy-Aware Adversarial Data Augmentation (Poster)	Hanping Zhang · Yuhong Guo
-	MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning (Poster)	Qiang He · Huangyuan Su · Chen Gong · Xinwen Hou