
Pretrained Encoders are All You Need
Mina Khan · Advait Rane · Srivatsa P · Shriram Chenniappa · Rishabh Anand · Sherjil Ozair · Patricia Maes

Data efficiency and generalization are key challenges in deep learning and deep reinforcement learning, as many models are trained on large-scale, domain-specific, and expensive-to-label datasets. Self-supervised models trained on large-scale uncurated datasets have shown successful transfer to diverse settings. We investigate using pretrained image representations and spatio-temporal attention for state representation learning in Atari. We also explore fine-tuning pretrained representations with self-supervised techniques, i.e., contrastive predictive coding, spatio-temporal contrastive learning, and augmentations. Our results show that pretrained representations are on par with state-of-the-art self-supervised methods trained on domain-specific data, and thus yield data- and compute-efficient state representations.
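The contrastive techniques named above (contrastive predictive coding and spatio-temporal contrastive learning) share a common core: an InfoNCE-style objective that scores matching representation pairs against in-batch negatives. The paper's actual architecture and training setup are not reproduced here; the sketch below is only a minimal, self-contained NumPy illustration of that objective, with the function name, temperature value, and toy data being this example's own assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss (illustrative sketch, not the paper's code).

    Each anchor's positive is the same-index row of `positives`;
    all other rows in the batch serve as negatives.
    """
    # L2-normalize so dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # mean negative log-probability of the diagonal (positive) pairs
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))                     # toy embeddings
# aligned positives: slightly perturbed copies of the anchors
loss_aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
# mismatched positives: unrelated random vectors
loss_random = info_nce_loss(z, rng.normal(size=(8, 16)))
print(loss_aligned < loss_random)
```

In the aligned case each anchor's true pair dominates the softmax, driving the loss toward zero, which is the signal such objectives use to shape (or fine-tune) an encoder's representations.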

Author Information

Mina Khan (MIT)
Advait Rane (BITS Pilani, Goa)
Srivatsa P (Massachusetts Institute of Technology)
Shriram Chenniappa (BITS Pilani, Goa)
Rishabh Anand (National University of Singapore)
Sherjil Ozair (DeepMind)
Patricia Maes (MIT Media Lab)
