Uncertainty Modeling from 50M to 1B
Dustin Tran

Fri Jul 23 06:15 AM -- 06:45 AM (PDT)
Event URL: https://docs.google.com/presentation/d/12s5IzVYqfALV9pjZswOB3De6ec7_2pqLrHPSiHjjUQc/edit?usp=sharing

I'll talk about one specific problem I have with the field: scale. Many papers fix an architecture and try to improve log-likelihood, comparing against the original base architecture regardless of how much additional compute is used to outperform it. Yet if we adjust for scale (for example, by comparing an ensemble of size 10 to a model scaled up 10x), we'd see improvements diminish significantly or vanish altogether. Ultimately, we should be examining the frontier of uncertainty-robustness performance as a function of compute. I'll substantiate this perspective with a few works with colleagues. These works advance the frontier with efficient ensembles alongside priors and inductive biases, and we'll also examine the uncertainty properties of existing giant models.

Author Information

Dustin Tran (Google)