Three Towers: Flexible Contrastive Learning with Pretrained Image Models
Jannik Kossen · Mark Collier · Basil Mustafa · Xiao Wang · Xiaohua Zhai · Lucas Beyer · Andreas Steiner · Jesse Berent · Rodolphe Jenatton · Efi Kokiopoulou
Event URL: https://openreview.net/forum?id=6nKjdEHDDU »

We introduce Three Towers (3T), a flexible method to improve the contrastive learning of vision-language models by incorporating pretrained image classifiers. While contrastive models are usually trained from scratch, LiT (Zhai et al., 2022) has recently shown performance gains from using pretrained classifier embeddings. However, LiT directly replaces the image tower with the frozen embeddings, excluding any potential benefits of contrastively training the image tower. With 3T, we propose a more flexible strategy that allows the image tower to benefit from both pretrained embeddings and contrastive training. To achieve this, we introduce a third tower that contains the frozen pretrained embeddings, and we encourage alignment between this third tower and the main image-text towers. Empirically, 3T consistently improves over LiT and the CLIP-style from-scratch baseline for retrieval tasks. For classification, 3T reliably improves over the from-scratch baseline, and while it underperforms relative to LiT for JFT-pretrained models, it outperforms LiT for ImageNet-21k and Places365 pretraining.
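The three-tower idea in the abstract can be sketched as a sum of pairwise alignment terms. The following is a minimal illustration, not the authors' implementation: the abstract does not specify the loss, so the use of a symmetric InfoNCE-style contrastive loss for every pair, the equal weighting of the three terms, the temperature value, and the names `contrastive_loss` and `three_towers_loss` are all assumptions made here for illustration. In a real training setup the third tower's embeddings would be held fixed (no gradient updates); this pure-Python sketch only computes the loss value.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def contrastive_loss(a, b, temperature=0.1):
    """Symmetric InfoNCE-style loss between two batches of embeddings.

    a[i] and b[i] are treated as a matching pair; all other
    combinations in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    n = len(a)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    logits_t = [list(col) for col in zip(*logits)]
    loss_ab = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    loss_ba = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return 0.5 * (loss_ab + loss_ba)


def three_towers_loss(image_emb, text_emb, frozen_emb, temperature=0.1):
    """Hypothetical 3T-style objective (equal weights are an assumption).

    Combines the usual image-text contrastive term with two terms that
    pull the image and text towers toward the frozen third tower.
    """
    main = contrastive_loss(image_emb, text_emb, temperature)
    image_align = contrastive_loss(image_emb, frozen_emb, temperature)
    text_align = contrastive_loss(text_emb, frozen_emb, temperature)
    return (main + image_align + text_align) / 3.0


# Toy batch of two examples with 2-dimensional embeddings.
image_emb = [[1.0, 0.1], [0.1, 1.0]]
text_emb = [[0.9, 0.2], [0.2, 0.9]]
frozen_emb = [[1.0, 0.0], [0.0, 1.0]]
print(three_towers_loss(image_emb, text_emb, frozen_emb))
```

Under these assumptions, dropping the two alignment terms recovers a CLIP-style from-scratch objective, whereas LiT would instead use the frozen embeddings directly in place of the image tower.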

Author Information

Jannik Kossen (University of Oxford)
Mark Collier (Google)
Basil Mustafa (Google)
Xiao Wang (Google)
Xiaohua Zhai (Google Brain)
Lucas Beyer (Google Brain (Zürich))
Andreas Steiner (Google)

Computer vision research engineer at Google DeepMind. Previously worked in tropical medicine. Educational background: MD, bioelectronics.

Jesse Berent (Google)
Rodolphe Jenatton (Google Research)
Efi Kokiopoulou (Google AI)

Efi has been a research scientist at Google since February 2013. She joined Google as a postdoctoral researcher in September 2011. Before that, she was a postdoctoral research fellow at the Seminar for Applied Mathematics (SAM) at ETH Zurich. She completed her PhD in December 2008 at the Signal Processing Laboratory (LTS4) of the Swiss Federal Institute of Technology (EPFL), Lausanne, under the supervision of Prof. Pascal Frossard. Before that, she was with the Computer Science & Engineering Department of the University of Minnesota, USA, where she obtained her M.Sc. degree in June 2005 under the supervision of Prof. Yousef Saad. She obtained B.Eng. and M.Sc.Eng. degrees in 2002 and 2003, respectively, at the Computer Engineering and Informatics Department of the University of Patras, Greece.

More from the Same Authors

  • 2022 : SI-Score »
    Jessica Yung · Rob Romijnders · Alexander Kolesnikov · Lucas Beyer · Josip Djolonga · Neil Houlsby · Sylvain Gelly · Mario Lucic · Xiaohua Zhai
  • 2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
    Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · E. Kelly Buchanan · Kevin Murphy · Mark Collier · Mike Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J. Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani
  • 2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
    Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · Estefany Kelly Buchanan · Kevin Murphy · Mark Collier · Michael Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J. Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani
  • 2023 Poster: Underspecification Presents Challenges for Credibility in Modern Machine Learning »
    Alexander D'Amour · Katherine Heller · Dan Moldovan · Ben Adlam · Babak Alipanahi · Alex Beutel · Christina Chen · Jonathan Deaton · Jacob Eisenstein · Matthew Hoffman · Farhad Hormozdiari · Neil Houlsby · Shaobo Hou · Ghassen Jerfel · Alan Karthikesalingam · Mario Lucic · Yian Ma · Cory McLean · Diana Mincu · Akinori Mitani · Andrea Montanari · Zachary Nado · Vivek Natarajan · Christopher Nielson · Thomas F. Osborne · Rajiv Raman · Kim Ramasamy · Rory Sayres · Jessica Schrouff · Martin Seneviratne · Shannon Sequeira · Harini Suresh · Victor Veitch · Maksym Vladymyrov · Xuezhi Wang · Kellie Webster · Steve Yadlowsky · Taedong Yun · Xiaohua Zhai · D. Sculley
  • 2023 Poster: Tuning Computer Vision Models With Task Rewards »
    André Susano Pinto · Alexander Kolesnikov · Yuge Shi · Lucas Beyer · Xiaohua Zhai
  • 2023 Poster: Scaling Vision Transformers to 22 Billion Parameters »
    Mostafa Dehghani · Josip Djolonga · Basil Mustafa · Piotr Padlewski · Jonathan Heek · Justin Gilmer · Andreas Steiner · Mathilde Caron · Robert Geirhos · Ibrahim Alabdulmohsin · Rodolphe Jenatton · Lucas Beyer · Michael Tschannen · Anurag Arnab · Xiao Wang · Carlos Riquelme · Matthias Minderer · Joan Puigcerver · Utku Evci · Manoj Kumar · Sjoerd van Steenkiste · Gamaleldin Elsayed · Aravindh Mahendran · Fisher Yu · Avital Oliver · Fantine Huot · Jasmijn Bastings · Mark Collier · Alexey Gritsenko · Vighnesh N Birodkar · Cristina Vasconcelos · Yi Tay · Thomas Mensink · Alexander Kolesnikov · Filip Pavetic · Dustin Tran · Thomas Kipf · Mario Lucic · Xiaohua Zhai · Daniel Keysers · Jeremiah Harmsen · Neil Houlsby
  • 2023 Poster: When does Privileged Information Explain Away Label Noise? »
    Guillermo Ortiz Jimenez · Mark Collier · Anant Nawalgaria · Alexander D'Amour · Jesse Berent · Rodolphe Jenatton · Efi Kokiopoulou
  • 2023 Oral: Scaling Vision Transformers to 22 Billion Parameters »
    Mostafa Dehghani · Josip Djolonga · Basil Mustafa · Piotr Padlewski · Jonathan Heek · Justin Gilmer · Andreas Steiner · Mathilde Caron · Robert Geirhos · Ibrahim Alabdulmohsin · Rodolphe Jenatton · Lucas Beyer · Michael Tschannen · Anurag Arnab · Xiao Wang · Carlos Riquelme · Matthias Minderer · Joan Puigcerver · Utku Evci · Manoj Kumar · Sjoerd van Steenkiste · Gamaleldin Elsayed · Aravindh Mahendran · Fisher Yu · Avital Oliver · Fantine Huot · Jasmijn Bastings · Mark Collier · Alexey Gritsenko · Vighnesh N Birodkar · Cristina Vasconcelos · Yi Tay · Thomas Mensink · Alexander Kolesnikov · Filip Pavetic · Dustin Tran · Thomas Kipf · Mario Lucic · Xiaohua Zhai · Daniel Keysers · Jeremiah Harmsen · Neil Houlsby
  • 2022 Poster: Transfer and Marginalize: Explaining Away Label Noise with Privileged Information »
    Mark Collier · Rodolphe Jenatton · Efi Kokiopoulou · Jesse Berent
  • 2022 Spotlight: Transfer and Marginalize: Explaining Away Label Noise with Privileged Information »
    Mark Collier · Rodolphe Jenatton · Efi Kokiopoulou · Jesse Berent
  • 2021 Poster: Active Testing: Sample-Efficient Model Evaluation »
    Jannik Kossen · Sebastian Farquhar · Yarin Gal · Tom Rainforth
  • 2021 Spotlight: Active Testing: Sample-Efficient Model Evaluation »
    Jannik Kossen · Sebastian Farquhar · Yarin Gal · Tom Rainforth
  • 2020 Poster: The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks »
    Jakub Swiatkowski · Kevin Roth · Bastiaan Veeling · Linh Tran · Joshua V Dillon · Jasper Snoek · Stephan Mandt · Tim Salimans · Rodolphe Jenatton · Sebastian Nowozin
  • 2020 Poster: How Good is the Bayes Posterior in Deep Neural Networks Really? »
    Florian Wenzel · Kevin Roth · Bastiaan Veeling · Jakub Swiatkowski · Linh Tran · Stephan Mandt · Jasper Snoek · Tim Salimans · Rodolphe Jenatton · Sebastian Nowozin
  • 2019 Poster: A Large-Scale Study on Regularization and Normalization in GANs »
    Karol Kurach · Mario Lucic · Xiaohua Zhai · Marcin Michalski · Sylvain Gelly
  • 2019 Oral: A Large-Scale Study on Regularization and Normalization in GANs »
    Karol Kurach · Mario Lucic · Xiaohua Zhai · Marcin Michalski · Sylvain Gelly
  • 2019 Poster: High-Fidelity Image Generation With Fewer Labels »
    Mario Lucic · Michael Tschannen · Marvin Ritter · Xiaohua Zhai · Olivier Bachem · Sylvain Gelly
  • 2019 Oral: High-Fidelity Image Generation With Fewer Labels »
    Mario Lucic · Michael Tschannen · Marvin Ritter · Xiaohua Zhai · Olivier Bachem · Sylvain Gelly