Uncertainty Modeling from 50M to 1B
Dustin Tran

Fri Jul 23 06:15 AM -- 06:45 AM (PDT)
Event URL: https://docs.google.com/presentation/d/12s5IzVYqfALV9pjZswOB3De6ec7_2pqLrHPSiHjjUQc/edit?usp=sharing

I'll talk about one specific problem I have with the field: scale. Many papers fix an architecture and try to improve log-likelihood, comparing against the original base architecture regardless of how much additional compute the new method uses to outperform it. Yet if we adjust for scale (for example, by comparing an ensemble of size 10 to a model scaled up 10x), we'd see the improvements diminish significantly or vanish altogether. Ultimately, we should be examining the frontier of uncertainty and robustness performance as a function of compute. I'll substantiate this perspective with a few works done with colleagues. These works advance the frontier with efficient ensembles alongside priors and inductive biases, and we'll also examine the uncertainty properties of existing giant models.
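As a rough illustration of what "the frontier as a function of compute" could mean in practice, the sketch below keeps only the methods that are not beaten by something cheaper or equally cheap. The `Result` type, the `pareto_frontier` helper, and every number in `RESULTS` are assumptions made up for this example; they are not results from the talk or from any paper.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    compute: float  # relative compute cost, with the base model at 1.0
    nll: float      # negative log-likelihood on held-out data (lower is better)


def pareto_frontier(results: List[Result]) -> List[Result]:
    """Keep results that no cheaper-or-equal result matches or beats on NLL."""
    frontier = []
    best_nll = float("inf")
    # Walk from cheapest to most expensive; ties in compute are ordered by NLL.
    for r in sorted(results, key=lambda r: (r.compute, r.nll)):
        if r.nll < best_nll:
            frontier.append(r)
            best_nll = r.nll
    return frontier


# Hypothetical numbers, for illustration only.
RESULTS = [
    Result("base", compute=1.0, nll=1.00),
    Result("ensemble_x10", compute=10.0, nll=0.85),
    Result("scaled_x10", compute=10.0, nll=0.84),
    Result("efficient_ensemble", compute=1.2, nll=0.90),
]

for r in pareto_frontier(RESULTS):
    print(f"{r.name}: compute={r.compute}, nll={r.nll}")
```

With these made-up numbers, the size-10 ensemble improves on the base model but falls off the frontier once it is compared to a model of equal compute, which is the kind of comparison the abstract argues for.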

Author Information

Dustin Tran (Google)

More from the Same Authors

  • 2022: Plex: Towards Reliability using Pretrained Large Model Extensions
    Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · E. Kelly Buchanan · Kevin Murphy · Mark Collier · Mike Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani
  • 2023 Poster: Scaling Vision Transformers to 22 Billion Parameters
    Mostafa Dehghani · Josip Djolonga · Basil Mustafa · Piotr Padlewski · Jonathan Heek · Justin Gilmer · Andreas Steiner · Mathilde Caron · Robert Geirhos · Ibrahim Alabdulmohsin · Rodolphe Jenatton · Lucas Beyer · Michael Tschannen · Anurag Arnab · Xiao Wang · Carlos Riquelme · Matthias Minderer · Joan Puigcerver · Utku Evci · Manoj Kumar · Sjoerd van Steenkiste · Gamaleldin Elsayed · Aravindh Mahendran · Fisher Yu · Avital Oliver · Fantine Huot · Jasmijn Bastings · Mark Collier · Alexey Gritsenko · Vighnesh N Birodkar · Cristina Vasconcelos · Yi Tay · Thomas Mensink · Alexander Kolesnikov · Filip Pavetic · Dustin Tran · Thomas Kipf · Mario Lucic · Xiaohua Zhai · Daniel Keysers · Jeremiah Harmsen · Neil Houlsby
  • 2023 Poster: A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models
    James Allingham · Jie Ren · Michael Dusenberry · Xiuye Gu · Yin Cui · Dustin Tran · Jeremiah Liu · Balaji Lakshminarayanan
  • 2023 Oral: Scaling Vision Transformers to 22 Billion Parameters
    Mostafa Dehghani · Josip Djolonga · Basil Mustafa · Piotr Padlewski · Jonathan Heek · Justin Gilmer · Andreas Steiner · Mathilde Caron · Robert Geirhos · Ibrahim Alabdulmohsin · Rodolphe Jenatton · Lucas Beyer · Michael Tschannen · Anurag Arnab · Xiao Wang · Carlos Riquelme · Matthias Minderer · Joan Puigcerver · Utku Evci · Manoj Kumar · Sjoerd van Steenkiste · Gamaleldin Elsayed · Aravindh Mahendran · Fisher Yu · Avital Oliver · Fantine Huot · Jasmijn Bastings · Mark Collier · Alexey Gritsenko · Vighnesh N Birodkar · Cristina Vasconcelos · Yi Tay · Thomas Mensink · Alexander Kolesnikov · Filip Pavetic · Dustin Tran · Thomas Kipf · Mario Lucic · Xiaohua Zhai · Daniel Keysers · Jeremiah Harmsen · Neil Houlsby
  • 2020 Poster: Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors
    Mike Dusenberry · Ghassen Jerfel · Yeming Wen · Yian Ma · Jasper Snoek · Katherine Heller · Balaji Lakshminarayanan · Dustin Tran
  • 2018 Poster: Image Transformer
    Niki Parmar · Ashish Vaswani · Jakob Uszkoreit · Lukasz Kaiser · Noam Shazeer · Alexander Ku · Dustin Tran
  • 2018 Oral: Image Transformer
    Niki Parmar · Ashish Vaswani · Jakob Uszkoreit · Lukasz Kaiser · Noam Shazeer · Alexander Ku · Dustin Tran