Poster
Improving language models by retrieving from trillions of tokens
Sebastian Borgeaud · Arthur Mensch · Jordan Hoffmann · Trevor Cai · Eliza Rutherford · Katie Millican · George van den Driessche · Jean-Baptiste Lespiau · Bogdan Damoc · Aidan Clark · Diego de Las Casas · Aurelia Guy · Jacob Menick · Roman Ring · Tom Hennigan · Saffron Huang · Loren Maggiore · Chris Jones · Albin Cassirer · Andy Brock · Michela Paganini · Geoffrey Irving · Oriol Vinyals · Simon Osindero · Karen Simonyan · Jack Rae · Erich Elsen · Laurent Sifre

Tue Jul 19 03:30 PM -- 05:30 PM (PDT) @ Hall E #405

We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens. With a 2 trillion token database, our Retrieval-Enhanced Transformer (RETRO) obtains comparable performance to GPT-3 and Jurassic-1 on the Pile, despite using 25× fewer parameters. After fine-tuning, RETRO performance translates to downstream knowledge-intensive tasks such as question answering. RETRO combines a frozen BERT retriever, a differentiable encoder and a chunked cross-attention mechanism to predict tokens based on an order of magnitude more data than what is typically consumed during training. We typically train RETRO from scratch, yet can also rapidly RETROfit pre-trained transformers with retrieval and still achieve good performance. Our work opens up new avenues for improving language models through explicit memory at unprecedented scale.
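The retrieval step the abstract describes can be sketched conceptually: split the input token sequence into fixed-size chunks, embed each chunk, and look up its nearest neighbours in the database by embedding similarity. This is a minimal NumPy illustration only; the function names are invented, and the actual system uses frozen BERT embeddings with approximate nearest-neighbour search over trillions of tokens rather than the brute-force cosine lookup shown here.

```python
import numpy as np

def chunk_tokens(tokens, chunk_size):
    """Split a token sequence into consecutive fixed-size chunks
    (the last chunk may be shorter)."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

def retrieve_neighbours(chunk_emb, db_embs, k=2):
    """Return indices of the k database chunks most similar to the query
    chunk embedding, by cosine similarity (stand-in for approximate search)."""
    db_norm = db_embs / np.linalg.norm(db_embs, axis=1, keepdims=True)
    q_norm = chunk_emb / np.linalg.norm(chunk_emb)
    sims = db_norm @ q_norm          # cosine similarity against every DB chunk
    return np.argsort(-sims)[:k]     # indices of the k highest similarities
```

In the full model, the retrieved neighbour chunks would then be encoded and attended to via chunked cross-attention when predicting the tokens of the following chunk.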

Author Information

Sebastian Borgeaud (DeepMind)
Arthur Mensch (DeepMind)
Jordan Hoffmann (DeepMind)
Trevor Cai (DeepMind)
Eliza Rutherford (DeepMind)
Katie Millican (DeepMind)
George van den Driessche (DeepMind)
Jean-Baptiste Lespiau (DeepMind)
Bogdan Damoc (DeepMind)
Aidan Clark (OpenAI)
Diego de Las Casas (DeepMind)
Aurelia Guy (Google Inc.)
Jacob Menick (DeepMind)
Roman Ring (DeepMind)
Tom Hennigan (DeepMind)
Saffron Huang (DeepMind)
Loren Maggiore (DeepMind)
Chris Jones (DeepMind)
Albin Cassirer (DeepMind)
Andy Brock (DeepMind)
Michela Paganini (DeepMind)
Geoffrey Irving (DeepMind)
Oriol Vinyals (Google DeepMind)

Oriol Vinyals is a Research Scientist at Google. He works in deep learning with the Google Brain team. Oriol holds a Ph.D. in EECS from the University of California, Berkeley, and a master's degree from the University of California, San Diego. He is a recipient of the 2011 Microsoft Research PhD Fellowship. He was an early adopter of the new deep learning wave at Berkeley, and in his thesis he focused on non-convex optimization and recurrent neural networks. At Google Brain he continues working on his areas of interest, which include artificial intelligence, with particular emphasis on machine learning, language, and vision.

Simon Osindero (DeepMind)
Karen Simonyan (Inflection AI)
Jack Rae (DeepMind)
Erich Elsen (Google)
Laurent Sifre (DeepMind)
