Poster
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
Ashley Edwards · Himanshu Sahni · Rosanne Liu · Jane Hung · Ankit Jain · Rui Wang · Adrien Ecoffet · Thomas Miconi · Charles Isbell · Jason Yosinski
Thu Jul 16 06:00 AM -- 06:45 AM & Thu Jul 16 05:00 PM -- 05:45 PM (PDT)
In this paper, we introduce a novel form of value function, $Q(s, s')$, that expresses the utility of transitioning from a state $s$ to a neighboring state $s'$ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still learning off-policy. We highlight the benefits of this approach in terms of value function transfer, learning within redundant action spaces, and learning off-policy from state observations generated by sub-optimal or completely random policies. Code and videos are available at http://sites.google.com/view/qss-paper.
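The idea of a value over state pairs can be illustrated with a small tabular sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy chain environment, the function names, and the recursion $Q(s, s') = r + \gamma \max_{s''} Q(s', s'')$, which is inferred from the abstract's description of "transitioning to $s'$ and then acting optimally thereafter". The paper's learned forward dynamics model is replaced here by a plain maximum over next states that were actually observed, and no step for recovering an action is shown. The data come from a completely random policy over four redundant actions, and the actions themselves are never stored.

```python
import random
from collections import defaultdict

N_STATES = 5   # hypothetical chain: states 0..4, state 4 is the goal
GAMMA = 0.9    # discount factor (assumed value)

def step(state, action):
    """Deterministic toy dynamics. Actions 0-1 move left, 2-3 move right (redundant)."""
    delta = -1 if action < 2 else 1
    next_state = min(max(state + delta, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def collect_transitions(num_steps, seed=0):
    """Run a uniformly random policy and record (s, s', r, done); actions are discarded."""
    rng = random.Random(seed)
    transitions = []
    state = 0
    for _ in range(num_steps):
        next_state, reward, done = step(state, rng.randrange(4))
        transitions.append((state, next_state, reward, done))
        state = 0 if done else next_state
    return transitions

def learn_qss(transitions, sweeps=50):
    """Tabular backup Q(s, s') = r + gamma * max_{s''} Q(s', s'') over observed neighbors."""
    q = defaultdict(float)
    neighbors = defaultdict(set)
    for s, s2, _, _ in transitions:
        neighbors[s].add(s2)
    for _ in range(sweeps):
        for s, s2, r, done in transitions:
            if done:
                future = 0.0
            else:
                future = max((q[(s2, s3)] for s3 in neighbors[s2]), default=0.0)
            # Learning rate 1 is enough because the toy dynamics are deterministic.
            q[(s, s2)] = r + GAMMA * future
    return q, neighbors

def best_next_state(q, neighbors, state):
    """Stand-in for the forward model: pick the observed neighbor with the highest Q(s, s')."""
    return max(neighbors[state], key=lambda s2: q[(state, s2)])

if __name__ == "__main__":
    q, neighbors = learn_qss(collect_transitions(2000))
    for s in range(N_STATES - 1):
        print(s, "->", best_next_state(q, neighbors, s))
```

Because the table is indexed by state pairs only, the two "left" actions and the two "right" actions collapse into a single entry each, which is one way to picture the benefit the abstract claims for redundant action spaces.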
Author Information
Ashley Edwards (Uber AI)
Himanshu Sahni (Georgia Institute of Technology)
Rosanne Liu (ML Collective)
Jane Hung (Uber)
Ankit Jain (Uber AI)
Rui Wang (Uber AI)
Adrien Ecoffet (OpenAI)
Thomas Miconi (Uber AI Labs)
Charles Isbell (Georgia Institute of Technology)
Jason Yosinski (Deep Collective)
More from the Same Authors
- 2021: When does loss-based prioritization fail?
  Niel Hu · Xinyu Hu · Rosanne Liu · Sara Hooker · Jason Yosinski
- 2023 Poster: Learning to acquire novel cognitive tasks with evolution, plasticity and meta-meta-learning
  Thomas Miconi
- 2022: Invited Talk #2 - Collaborations with ML researchers
  Rosanne Liu
- 2021 Workshop: Workshop on Computational Approaches to Mental Health @ ICML 2021
  Niranjani Prasad · Caroline Weis · Shems Saleh · Rosanne Liu · Jake Vasilakes · Agni Kumar · Tianlin Zhang · Ida Momennejad · Danielle Belgrave
- 2021 Social: Open Collaboration in ML Research
  Brenda Ng · Alexander Gu · Jason Yosinski · Rosanne Liu · Luis Granados · Suzana Ilic
- 2021 Poster: Reinforcement Learning Under Moral Uncertainty
  Adrien Ecoffet · Joel Lehman
- 2021 Spotlight: Reinforcement Learning Under Moral Uncertainty
  Adrien Ecoffet · Joel Lehman
- 2020: Brainstorming & Closing
  Mayoore Jaiswal · Ryan Lowe · Jesse Dodge · Jessica Forde · Rosanne Liu
- 2020: Q&A: Pascale Fung
  Pascale Fung · Rosanne Liu
- 2020 Workshop: MLRetrospectives: A Venue for Self-Reflection in ML Research
  Jessica Forde · Jesse Dodge · Mayoore Jaiswal · Rosanne Liu · Ryan Lowe · Joelle Pineau · Yoshua Bengio
- 2020 Poster: Enhanced POET: Open-ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions
  Rui Wang · Joel Lehman · Aditya Rawal · Jiale Zhi · Yulun Li · Jeffrey Clune · Kenneth Stanley
- 2019: posters
  Zhengxing Chen · Juan Jose Garau Luis · Ignacio Albert Smet · Aditya Modi · Sabina Tomkins · Riley Simmons-Edler · Hongzi Mao · Alexander Irpan · Hao Lu · Rose Wang · Subhojyoti Mukherjee · Aniruddh Raghu · Syed Arbab Mohd Shihab · Byung Hoon Ahn · Rasool Fakoor · Pratik Chaudhari · Elena Smirnova · Min-hwan Oh · Xiaocheng Tang · Tony Qin · Qingyang Li · Marc Brittain · Ian Fox · Supratik Paul · Xiaofeng Gao · Yinlam Chow · Gabriel Dulac-Arnold · Ofir Nachum · Nikos Karampatziakis · Bharathan Balaji · Ali Davody · Djallel Bouneffouf · Himanshu Sahni · Soo Kim · Andrey Kolobov · Alexander Amini · Yao Liu · Xinshi Chen · Craig Boutilier
- 2019 Poster: Metropolis-Hastings Generative Adversarial Networks
  Ryan Turner · Jane Hung · Eric Frank · Yunus Saatchi · Jason Yosinski
- 2019 Oral: Metropolis-Hastings Generative Adversarial Networks
  Ryan Turner · Jane Hung · Eric Frank · Yunus Saatchi · Jason Yosinski
- 2019 Poster: Imitating Latent Policies from Observation
  Ashley Edwards · Himanshu Sahni · Yannick Schroecker · Charles Isbell
- 2019 Oral: Imitating Latent Policies from Observation
  Ashley Edwards · Himanshu Sahni · Yannick Schroecker · Charles Isbell