Tutorial
A Tutorial on Attention in Deep Learning
Alex Smola · Aston Zhang

Mon Jun 10th 03:45 -- 06:00 PM @ Hall A
Event URL: https://www.d2l.ai

Attention is a key mechanism for enabling nonparametric models in deep learning, and arguably underlies much of the field's recent progress. Although popularized by neural machine translation, its roots can be traced back to neuroscience, and within deep learning it arguably first appeared as the gating and forgetting mechanisms of LSTMs. Over the past five years, attention has been key to advancing the state of the art in areas as diverse as natural language processing, computer vision, speech recognition, image synthesis, solving traveling salesman problems, and reinforcement learning. This tutorial offers a coherent overview of various types of attention; efficient implementations in Jupyter notebooks that give the audience hands-on experience replicating and applying attention mechanisms; and a textbook (www.d2l.ai) that allows the audience to dive more deeply into the underlying theory.
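At its core, an attention mechanism computes a weighted average of value vectors, with weights derived from the compatibility of a query with a set of keys. The following is a minimal NumPy sketch of scaled dot-product attention, one common variant; it is illustrative only and not taken from the tutorial's notebooks.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(queries, keys, values):
    # softmax(Q K^T / sqrt(d)) V: each query attends over all key-value pairs.
    d = queries.shape[-1]
    scores = queries @ keys.swapaxes(-1, -2) / np.sqrt(d)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ values, weights

# Toy example: 2 queries attend over 3 key-value pairs of dimension 4.
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4))
k = rng.normal(size=(3, 4))
v = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape, w.shape)  # (2, 4) (2, 3)
```

The 1/sqrt(d) scaling keeps the dot products from growing with the key dimension, which would otherwise push the softmax into regions of vanishing gradient.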

Author Information

Alex Smola (Amazon)
Aston Zhang (AWS AI)

Aston Zhang is an applied scientist at Amazon Web Services AI. His research interests are in deep learning. He received a Ph.D. in computer science from the University of Illinois at Urbana-Champaign. He has served as an editorial board member for Frontiers in Big Data and as a program committee member (reviewer) for ICML, NeurIPS, WWW, KDD, SIGIR, and WSDM. His book Dive into Deep Learning (www.d2l.ai) was taught at UC Berkeley in Spring 2019 and has been used as a textbook worldwide.
