Oral
Area Attention
Yang Li · Lukasz Kaiser · Samy Bengio · Si Si

Tue Jun 11th 05:00 -- 05:05 PM @ Hall A

Existing attention mechanisms are trained to attend to individual items in a collection (the memory) with a predefined, fixed granularity, e.g., a word token or an image grid. We propose area attention: a way to attend to areas in the memory, where each area contains a group of items that are structurally adjacent, e.g., spatially for a 2D memory such as images, or temporally for a 1D memory such as natural language sentences. Importantly, the shape and the size of an area are dynamically determined via learning, which enables a model to attend to information with varying granularity. Area attention can easily work with existing model architectures such as multi-head attention for simultaneously attending to multiple areas in the memory. We evaluate area attention on two tasks: neural machine translation (both character- and token-level) and image captioning, and improve upon strong (state-of-the-art) baselines in all cases. These improvements are obtainable with a basic form of area attention that is parameter-free.
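As an illustration only, the idea of attending to areas of a 1D memory can be sketched in plain Python. The abstract does not say how an area's key and value are formed, so this sketch assumes one simple parameter-free choice: the area key is the mean of its item keys and the area value is the sum of its item values. The function names (enumerate_areas, area_attention_1d) and the max_width parameter are hypothetical, and single-query dot-product attention stands in for whatever attention variant a real model would use.

```python
import math


def enumerate_areas(n, max_width):
    """Return every contiguous span (start, end) over n items, end exclusive,
    whose width is between 1 and max_width."""
    return [(s, s + w) for w in range(1, max_width + 1) for s in range(n - w + 1)]


def area_attention_1d(query, keys, values, max_width=2):
    """Single-query dot-product attention over areas of a 1D memory.

    Assumption (not stated in the abstract): an area's key is the mean of
    its item keys and its value is the sum of its item values, so no new
    parameters are introduced.
    """
    key_dim = len(keys[0])
    value_dim = len(values[0])
    areas = enumerate_areas(len(keys), max_width)

    area_keys, area_values = [], []
    for start, end in areas:
        width = end - start
        # Mean-pool the keys of the items inside the area.
        area_keys.append(
            [sum(keys[i][d] for i in range(start, end)) / width for d in range(key_dim)]
        )
        # Sum-pool the values of the items inside the area.
        area_values.append(
            [sum(values[i][d] for i in range(start, end)) for d in range(value_dim)]
        )

    # Softmax over all areas, with the max subtracted for numerical stability.
    logits = [sum(q * k for q, k in zip(query, ak)) for ak in area_keys]
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The output is the attention-weighted sum of the area values.
    output = [
        sum(w * av[d] for w, av in zip(weights, area_values)) for d in range(value_dim)
    ]
    return output, weights, areas
```

With max_width=1 every area is a single item, so the sketch reduces to ordinary dot-product attention; larger widths let the softmax place weight on multi-item spans alongside single items.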

Author Information

Yang Li (Google Research)
Lukasz Kaiser (Google)
Samy Bengio (Google Research Brain Team)
Si Si (Google Research)

Related Events (a corresponding poster, oral, or spotlight)

  • 2019 Poster: Area Attention
    Tue Jun 11th 06:30 -- 09:00 PM Room Pacific Ballroom

More from the Same Authors