In a partially observable Markov decision process (POMDP), an agent typically uses a representation of the past to approximate the underlying MDP. We propose to leverage a frozen Pretrained Language Transformer (PLT) for history representation and compression to improve sample efficiency. To avoid training the Transformer, we introduce FrozenHopfield, which automatically associates observations with pretrained token embeddings. To form these associations, a modern Hopfield network stores the token embeddings, which are retrieved by queries obtained via a fixed random projection of the observations. Our new method, HELM, enables actor-critic network architectures that contain a pretrained language Transformer as a memory module for history representation. Since a representation of the past need not be learned, HELM is much more sample efficient than competitors. On Minigrid and Procgen environments, HELM achieves new state-of-the-art results. Our code is available at https://github.com/ml-jku/helm.
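The FrozenHopfield step described above admits a compact sketch: a fixed Gaussian random matrix projects each observation into the embedding space of the language model, and one step of modern Hopfield retrieval (a softmax-weighted average over the frozen token embeddings) yields the vector that is fed to the PLT. The PyTorch sketch below is illustrative only; the class name, the 1/sqrt(d_obs) scaling of the projection, and the inverse temperature beta are assumptions, not the exact choices of the authors' repository.

```python
import torch

class FrozenHopfield(torch.nn.Module):
    """Minimal sketch of the FrozenHopfield association step (illustrative)."""

    def __init__(self, token_embeddings: torch.Tensor, d_obs: int, beta: float = 1.0):
        super().__init__()
        # Frozen pretrained token embeddings, shape (vocab_size, d_model),
        # act as the stored patterns of the modern Hopfield network.
        self.register_buffer("E", token_embeddings)
        d_model = token_embeddings.shape[1]
        # Random but fixed projection from observation space into embedding
        # space; registered as a buffer so it is never trained.
        self.register_buffer("P", torch.randn(d_model, d_obs) / d_obs ** 0.5)
        self.beta = beta  # inverse temperature of the retrieval softmax

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Query: fixed random projection of the flattened observation.
        q = self.P @ obs
        # One step of modern Hopfield retrieval: E^T softmax(beta * E q),
        # i.e. a softmax-weighted average of the stored token embeddings.
        attn = torch.softmax(self.beta * (self.E @ q), dim=-1)
        return self.E.T @ attn  # vector in the PLT's token-embedding space


# Usage sketch: associate a flattened observation with the embedding space.
vocab_size, d_model, d_obs = 50257, 768, 3 * 64 * 64  # hypothetical sizes
E = torch.randn(vocab_size, d_model)  # stands in for frozen PLT embeddings
fh = FrozenHopfield(E, d_obs)
token_vec = fh(torch.randn(d_obs))    # shape (d_model,)
```

In HELM, a sequence of such retrieved vectors is passed through the frozen language Transformer, whose output serves as the compressed history representation consumed by the actor-critic heads; since neither the projection nor the Transformer is trained, only the retrieval output needs to be computed during rollouts.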
Author Information
Fabian Paischer (Johannes Kepler University Linz)
Thomas Adler (LIT AI Lab / JKU Linz)
Vihang Patil (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria)
Angela Bitto-Nemling (JKU)
Markus Holzleitner (LIT AI Lab / University Linz)
Sebastian Lehner (JKU Linz)
Hamid Eghbal-zadeh (Meta)
Sepp Hochreiter (ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Institute for Advanced Research in Artificial Intelligence (IARAI))

Sepp Hochreiter heads the Institute for Machine Learning, the ELLIS Unit Linz, and the LIT AI Lab at the JKU Linz, and is director of the private research institute IARAI. He is a pioneer of deep learning: he discovered the famous problem of vanishing and exploding gradients and invented the long short-term memory (LSTM).
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: History Compression via Language Models in Reinforcement Learning
  Wed. Jul 20th through Thu. the 21st, Room Hall E #828
More from the Same Authors
- 2022: [Poster] Mimicking Iterative Learning with Modern Hopfield Networks for Tabular Data
  Angela Bitto-Nemling
- 2023 Poster: Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language
  Philipp Seidl · Andreu Vall · Sepp Hochreiter · Günter Klambauer
- 2022 Poster: Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
  Vihang Patil · Markus Hofmarcher · Marius-Constantin Dinu · Matthias Dorfer · Patrick Blies · Johannes Brandstetter · Jose A. Arjona-Medina · Sepp Hochreiter
- 2022 Oral: Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
  Vihang Patil · Markus Hofmarcher · Marius-Constantin Dinu · Matthias Dorfer · Patrick Blies · Johannes Brandstetter · Jose A. Arjona-Medina · Sepp Hochreiter
- 2022: Poster Session 2
  Asra Aslam · Sowmya Vijayakumar · Heta Gandhi · Mary Adewunmi · You Cheng · Tong Yang · Kristina Ulicna · · Weiwei Zong · Narmada Naik · Akshata Tiwari · Ambreen Hamadani · Mayuree Binjolkar · Charupriya Sharma · Chhavi Yadav · Yu Yang · Winnie Xu · QINGQING ZHAO · Julissa Giuliana Villanueva Llerena · Lilian Mkonyi · Berthine Nyunga Mpinda · Rehema Mwawado · Tooba Imtiaz · Desi Ivanova · Emma Johanna Mikaela Petersson Svensson · Angela Bitto-Nemling · Elisabeth Rumetshofer · Ana Sanchez Fernandez · Garima Giri · Sigrid Passano Hellan · Catherine Ordun · Vasiliki Tassopoulou · Gina Wong
- 2022: Poster Session 1
  Asra Aslam · Sowmya Vijayakumar · Heta Gandhi · Mary Adewunmi · You Cheng · Tong Yang · Kristina Ulicna · · Weiwei Zong · Narmada Naik · Akshata Tiwari · Ambreen Hamadani · Mayuree Binjolkar · Charupriya Sharma · Chhavi Yadav · Yu Yang · Winnie Xu · QINGQING ZHAO · Julissa Giuliana Villanueva Llerena · Lilian Mkonyi · Berthine Nyunga Mpinda · Rehema Mwawado · Tooba Imtiaz · Desi Ivanova · Emma Johanna Mikaela Petersson Svensson · Angela Bitto-Nemling · Elisabeth Rumetshofer · Ana Sanchez Fernandez · Garima Giri · Sigrid Passano Hellan · Catherine Ordun · Vasiliki Tassopoulou · Gina Wong
- 2021 Spotlight: MC-LSTM: Mass-Conserving LSTM
  Pieter-Jan Hoedt · Frederik Kratzert · Daniel Klotz · Christina Halmich · Markus Holzleitner · Grey Nearing · Sepp Hochreiter · Günter Klambauer
- 2021 Poster: MC-LSTM: Mass-Conserving LSTM
  Pieter-Jan Hoedt · Frederik Kratzert · Daniel Klotz · Christina Halmich · Markus Holzleitner · Grey Nearing · Sepp Hochreiter · Günter Klambauer