Social learning is a key component of human and animal intelligence. By taking cues from the behavior of experts in their environment, social learners can acquire sophisticated behavior and rapidly adapt to new circumstances. This paper investigates whether independent reinforcement learning (RL) agents in a multi-agent environment can learn to use social learning to improve their performance. We find that in most circumstances, vanilla model-free RL agents do not use social learning. We analyze the reasons for this deficiency, and show that by imposing constraints on the training environment and introducing a model-based auxiliary loss, we are able to obtain generalized social learning policies which enable agents to: i) discover complex skills that are not learned from single-agent training, and ii) adapt online to novel environments by taking cues from experts present in the new environment. In contrast, agents trained with model-free RL or imitation learning generalize poorly and do not succeed in the transfer tasks. By mixing multi-agent and solo training, we can obtain agents that use social learning to gain skills that they can deploy when alone, even outperforming agents trained alone from the start.
Author Information
Kamal Ndousse (Anthropic)
Douglas Eck (Google Brain)
Sergey Levine (UC Berkeley)
Natasha Jaques (Google Brain, UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Emergent Social Learning via Multi-agent Reinforcement Learning
  Wed. Jul 21st 04:00 -- 06:00 AM
More from the Same Authors
- 2021: Explore and Control with Adversarial Surprise
  Arnaud Fickinger · Natasha Jaques · Samyak Parajuli · Michael Chang · Nicholas Rhinehart · Glen Berseth · Stuart Russell · Sergey Levine
- 2021 Poster: PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
  Angelos Filos · Clare Lyle · Yarin Gal · Sergey Levine · Natasha Jaques · Gregory Farquhar
- 2021 Oral: PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning
  Angelos Filos · Clare Lyle · Yarin Gal · Sergey Levine · Natasha Jaques · Gregory Farquhar
- 2019 Poster: Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
  Natasha Jaques · Angeliki Lazaridou · Edward Hughes · Caglar Gulcehre · Pedro Ortega · DJ Strouse · Joel Z Leibo · Nando de Freitas
- 2019 Oral: Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
  Natasha Jaques · Angeliki Lazaridou · Edward Hughes · Caglar Gulcehre · Pedro Ortega · DJ Strouse · Joel Z Leibo · Nando de Freitas
- 2019 Poster: Learning to Groove with Inverse Sequence Transformations
  Jon Gillick · Adam Roberts · Jesse Engel · Douglas Eck · David Bamman
- 2019 Oral: Learning to Groove with Inverse Sequence Transformations
  Jon Gillick · Adam Roberts · Jesse Engel · Douglas Eck · David Bamman
- 2018 Poster: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music
  Adam Roberts · Jesse Engel · Colin Raffel · Curtis Hawthorne · Douglas Eck
- 2018 Oral: A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music
  Adam Roberts · Jesse Engel · Colin Raffel · Curtis Hawthorne · Douglas Eck
- 2017 Poster: Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
  Natasha Jaques · Shixiang Gu · Dzmitry Bahdanau · Jose Miguel Hernandez-Lobato · Richard E Turner · Douglas Eck
- 2017 Poster: Online and Linear-Time Attention by Enforcing Monotonic Alignments
  Colin Raffel · Thang Luong · Peter Liu · Ron Weiss · Douglas Eck
- 2017 Poster: Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
  Cinjon Resnick · Adam Roberts · Jesse Engel · Douglas Eck · Sander Dieleman · Karen Simonyan · Mohammad Norouzi
- 2017 Talk: Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
  Cinjon Resnick · Adam Roberts · Jesse Engel · Douglas Eck · Sander Dieleman · Karen Simonyan · Mohammad Norouzi
- 2017 Talk: Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
  Natasha Jaques · Shixiang Gu · Dzmitry Bahdanau · Jose Miguel Hernandez-Lobato · Richard E Turner · Douglas Eck
- 2017 Talk: Online and Linear-Time Attention by Enforcing Monotonic Alignments
  Colin Raffel · Thang Luong · Peter Liu · Ron Weiss · Douglas Eck