A key challenge in developing intelligent, autonomous learning agents is enabling them to interact effectively with humans. This tutorial covers the theoretical and practical foundations of interactive agents. In the first part, we focus on models of human behavior in isolation, how these models can be used for effective coordination, and how they can be optimized for influencing the partner. In the second part, we introduce co-adaptation settings, where human preferences are non-stationary and adapt over time, and we discuss how this leads to the emergence of new norms, conventions, and equilibria. Finally, we wrap up by introducing approaches for inferring the preferences of human partners from the range of offline and online data sources present in interactive domains. Throughout the tutorial, we go over concrete examples from applications in autonomous driving, mixed-autonomy traffic networks, personal robotics, and multi-agent games.
Mon 10:00 a.m. - 10:35 a.m. | Learning objectives and preferences: WHAT DATA? From diverse types of human data (Talk) | Anca Dragan
Mon 10:35 a.m. - 11:00 a.m. | Learning objectives and preferences: HOW? Actively (Talk) | Dorsa Sadigh
Mon 11:00 a.m. - 11:05 a.m. | Q&A | Dorsa Sadigh · Anca Dragan
Mon 11:05 a.m. - 11:15 a.m. | Break
Mon 11:15 a.m. - 11:25 a.m. | Learning to interact: GAME! Coordinating actions with humans via game theory (Talk) | Dorsa Sadigh
Mon 11:25 a.m. - 11:35 a.m. | Learning to interact: PARTIAL OBSERVABILITY The actions you take as part of the task are the queries! (Talk) | Anca Dragan
Mon 11:35 a.m. - 11:45 a.m. | Learning to interact: PARTIAL OBSERVABILITY + GAME Theory of mind on steroids (Talk) | Anca Dragan
Mon 11:45 a.m. - 12:00 p.m. | Learning to interact: LET’S LEARN IT ALL Implicit coordination through learned representations (Talk) | Dorsa Sadigh
Author Information
Dorsa Sadigh (Stanford University)
Anca Dragan (University of California, Berkeley)
More from the Same Authors
-
2022 : A Study of Causal Confusion in Preference-Based Reward Learning »
Jeremy Tien · Zhiyang He · Zackory Erickson · Anca Dragan · Daniel S Brown -
2023 : Preventing Reward Hacking with Occupancy Measure Regularization »
Cassidy Laidlaw · Shivam Singhal · Anca Dragan -
2023 : Parallel Sampling of Diffusion Models »
Andy Shih · Suneel Belkhale · Stefano Ermon · Dorsa Sadigh · Nima Anari -
2023 : Video-Guided Skill Discovery »
Manan Tomar · Dibya Ghosh · Vivek Myers · Anca Dragan · Matthew Taylor · Philip Bachman · Sergey Levine -
2023 : Inverse Preference Learning: Preference-based RL without a Reward Function »
Joey Hejna · Dorsa Sadigh -
2023 Workshop: Interactive Learning with Implicit Human Feedback »
Andi Peng · Akanksha Saran · Andreea Bobu · Tengyang Xie · Pierre-Yves Oudeyer · Anca Dragan · John Langford -
2023 : Bridging RL Theory and Practice with the Effective Horizon »
Cassidy Laidlaw · Stuart Russell · Anca Dragan -
2023 : Learning Optimal Advantage from Preferences and Mistaking it for Reward »
William Knox · Stephane Hatgis-Kessell · Sigurdur Adalgeirsson · Serena Booth · Anca Dragan · Peter Stone · Scott Niekum -
2023 : Aligning Robots with Human Preferences »
Dorsa Sadigh -
2023 Poster: Generating Language Corrections for Teaching Physical Control Tasks »
Megha Srivastava · Noah Goodman · Dorsa Sadigh -
2023 Poster: Distance Weighted Supervised Learning for Offline Interaction Data »
Joey Hejna · Jensen Gao · Dorsa Sadigh -
2023 Poster: Contextual Reliability: When Different Features Matter in Different Contexts »
Gaurav Ghosal · Amrith Setlur · Daniel S Brown · Anca Dragan · Aditi Raghunathan -
2023 Poster: Long Horizon Temperature Scaling »
Andy Shih · Dorsa Sadigh · Stefano Ermon -
2023 Poster: Automatically Auditing Large Language Models via Discrete Optimization »
Erik Jones · Anca Dragan · Aditi Raghunathan · Jacob Steinhardt -
2023 Poster: Language Instructed Reinforcement Learning for Human-AI Coordination »
Hengyuan Hu · Dorsa Sadigh -
2022 Poster: Imitation Learning by Estimating Expertise of Demonstrators »
Mark Beliaev · Andy Shih · Stefano Ermon · Dorsa Sadigh · Ramtin Pedarsani -
2022 Spotlight: Imitation Learning by Estimating Expertise of Demonstrators »
Mark Beliaev · Andy Shih · Stefano Ermon · Dorsa Sadigh · Ramtin Pedarsani -
2022 Poster: Estimating and Penalizing Induced Preference Shifts in Recommender Systems »
Micah Carroll · Anca Dragan · Stuart Russell · Dylan Hadfield-Menell -
2022 Spotlight: Estimating and Penalizing Induced Preference Shifts in Recommender Systems »
Micah Carroll · Anca Dragan · Stuart Russell · Dylan Hadfield-Menell -
2022 : Learning to interact: LET’S LEARN IT ALL Implicit coordination through learned representations »
Dorsa Sadigh -
2022 : Learning to interact: PARTIAL OBSERVABILITY + GAME Theory of mind on steroids »
Anca Dragan -
2022 : Learning to interact: PARTIAL OBSERVABILITY The actions you take as part of the task are the queries! »
Anca Dragan -
2022 : Learning to interact: GAME! Coordinating actions with humans via game theory »
Dorsa Sadigh -
2022 : Q&A »
Dorsa Sadigh · Anca Dragan -
2022 : Learning objectives and preferences: HOW? Actively »
Dorsa Sadigh -
2022 : Learning objectives and preferences: WHAT DATA? From diverse types of human data »
Anca Dragan -
2021 : The Role of Conventions in Adaptive Human-AI Collaboration »
Dorsa Sadigh -
2021 Poster: Policy Gradient Bayesian Robust Optimization for Imitation Learning »
Zaynah Javed · Daniel Brown · Satvik Sharma · Jerry Zhu · Ashwin Balakrishna · Marek Petrik · Anca Dragan · Ken Goldberg -
2021 Spotlight: Policy Gradient Bayesian Robust Optimization for Imitation Learning »
Zaynah Javed · Daniel Brown · Satvik Sharma · Jerry Zhu · Ashwin Balakrishna · Marek Petrik · Anca Dragan · Ken Goldberg -
2021 Poster: Targeted Data Acquisition for Evolving Negotiation Agents »
Minae Kwon · Siddharth Karamcheti · Mariano-Florentino Cuellar · Dorsa Sadigh -
2021 Poster: Value Alignment Verification »
Daniel Brown · Jordan Schneider · Anca Dragan · Scott Niekum -
2021 Spotlight: Targeted Data Acquisition for Evolving Negotiation Agents »
Minae Kwon · Siddharth Karamcheti · Mariano-Florentino Cuellar · Dorsa Sadigh -
2021 Spotlight: Value Alignment Verification »
Daniel Brown · Jordan Schneider · Anca Dragan · Scott Niekum -
2020 : Invited Talk 7: Prof. Anca Dragan from UC Berkeley »
Anca Dragan -
2020 : "Active Learning through Physically-embodied, Synthesized-from-“scratch” Queries" »
Anca Dragan -
2020 : "Active Learning of Robot Reward Functions" »
Dorsa Sadigh -
2020 Poster: Learning Human Objectives by Evaluating Hypothetical Behavior »
Siddharth Reddy · Anca Dragan · Sergey Levine · Shane Legg · Jan Leike -
2019 : Dorsa Sadigh: "Influencing Interactive Mixed-Autonomy Systems" »
Dorsa Sadigh -
2019 Poster: On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference »
Rohin Shah · Noah Gundotra · Pieter Abbeel · Anca Dragan -
2019 Oral: On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference »
Rohin Shah · Noah Gundotra · Pieter Abbeel · Anca Dragan -
2019 Poster: Learning a Prior over Intent via Meta-Inverse Reinforcement Learning »
Kelvin Xu · Ellis Ratner · Anca Dragan · Sergey Levine · Chelsea Finn -
2019 Oral: Learning a Prior over Intent via Meta-Inverse Reinforcement Learning »
Kelvin Xu · Ellis Ratner · Anca Dragan · Sergey Levine · Chelsea Finn -
2018 Poster: An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning »
Dhruv Malik · Malayandi Palaniappan · Jaime Fisac · Dylan Hadfield-Menell · Stuart Russell · Anca Dragan -
2018 Oral: An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning »
Dhruv Malik · Malayandi Palaniappan · Jaime Fisac · Dylan Hadfield-Menell · Stuart Russell · Anca Dragan