Confident Adaptive Language Modeling
Tal Schuster · Adam Fisch · Jai Gupta · Mostafa Dehghani · Dara Bahri · Vinh Tran · Yi Tay · Don Metzler

Fri Jul 22 12:15 PM -- 01:15 PM (PDT)

We introduce Confident Adaptive Language Modeling (CALM), a framework for dynamically allocating different amounts of compute per input and generation timestep. Early exit decoding involves several challenges that we address here, such as: (1) what confidence measure to use; (2) connecting sequence-level constraints to local per-token exit decisions; and (3) attending back to missing hidden representations due to early exits in previous tokens. Through theoretical analysis and empirical experiments on three diverse text generation tasks, we demonstrate the efficacy of our framework in reducing compute (a potential speedup of up to 3x) while provably maintaining high performance.
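
To make the per-token exit decision concrete, the following is a minimal sketch and not the authors' implementation. It assumes the confidence measure is the gap between the two largest softmax probabilities at a layer, and that decoding for a token stops at the first layer whose confidence reaches a fixed threshold. The names `softmax`, `top2_gap`, `early_exit`, and the `threshold` parameter are hypothetical, and the per-layer logits are supplied directly rather than produced by a real model.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top2_gap(probs: Sequence[float]) -> float:
    """Assumed confidence measure: top-1 minus top-2 probability.

    Requires a vocabulary of at least two entries.
    """
    ordered = sorted(probs, reverse=True)
    return ordered[0] - ordered[1]


def early_exit(
    per_layer_logits: Sequence[Sequence[float]], threshold: float
) -> Tuple[int, int]:
    """Return (exit_layer, predicted_token) for a single generation step.

    Layers are scanned in order; the step exits at the first layer whose
    confidence reaches `threshold`, or at the final layer otherwise.
    """
    last = len(per_layer_logits) - 1
    for layer, logits in enumerate(per_layer_logits):
        probs = softmax(logits)
        if layer == last or top2_gap(probs) >= threshold:
            token = max(range(len(probs)), key=probs.__getitem__)
            return layer, token
    raise ValueError("per_layer_logits must be non-empty")


# Toy logits for a 3-layer decoder over a 3-token vocabulary (made-up values).
toy_logits = [[0.1, 0.0, 0.0], [5.0, 0.0, 0.0], [9.0, 0.0, 0.0]]
exit_layer, token = early_exit(toy_logits, threshold=0.9)
```

In this toy setup a lower threshold exits earlier and saves more compute, while a threshold that no intermediate layer reaches falls back to the full model. The sketch does not cover challenges (2) and (3) from the abstract: choosing the threshold so that sequence-level constraints hold, and handling hidden states that are missing for tokens that exited early.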

Author Information

Tal Schuster (Google)
Adam Fisch (MIT)
Jai Gupta (Google)
Mostafa Dehghani (Google Brain)
Dara Bahri (Google Research)
Vinh Tran (Google)
Yi Tay (Google)
Don Metzler (Google)

More from the Same Authors