Visual attention helps human vision achieve robust perception under noise, corruption, and distribution shifts, areas where modern neural networks still fall short. We present VARS, Visual Attention from Recurrent Sparse reconstruction, a new attention formulation built on two prominent features of the human visual attention mechanism: recurrency and sparsity. Related features are grouped together via recurrent connections between neurons, and salient objects emerge via sparse regularization. VARS adopts an attractor network with recurrent connections that converges toward a stable pattern over time. Network layers are represented as ordinary differential equations (ODEs), formulating attention as a recurrent attractor network that equivalently optimizes the sparse reconstruction of the input using a dictionary of "templates" encoding underlying patterns of the data. We show that self-attention is a special case of VARS with a single-step optimization and no sparsity constraint. VARS can readily replace self-attention in popular vision transformers, consistently improving their robustness across various benchmarks.
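The abstract's core idea, attention as iterative sparse reconstruction, can be illustrated with a minimal sketch. The function name `sparse_recon_attention`, the ISTA-style update, and the dictionary shapes below are illustrative assumptions, not the authors' exact formulation; they only demonstrate the general scheme of reconstructing the input from a dictionary of templates under an l1 penalty, with a single unregularized step reducing to a plain linear map, analogous to the self-attention special case noted above.

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of the l1 norm; this is what induces sparsity."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_recon_attention(x, D, lam=0.05, n_iter=20):
    """Hypothetical sketch of attention as recurrent sparse reconstruction
    (an ISTA-style attractor loop, NOT the paper's exact algorithm).

    x : (n, d) input tokens
    D : (d, k) dictionary of "templates"
    Returns the sparse reconstruction of x, playing the role of the
    attention output.
    """
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # Lipschitz-safe step size
    z = np.zeros((x.shape[0], D.shape[1]))    # sparse codes, start at zero
    for _ in range(n_iter):
        grad = (z @ D.T - x) @ D              # gradient of 0.5 * ||x - z D^T||^2
        z = soft_threshold(z - step * grad, step * lam)
    return z @ D.T

# Special case analogous to the abstract's claim: one step from z = 0 with
# lam = 0 gives z = step * x @ D, so the output is x @ D @ D.T (up to scale),
# a single linear attention-like map with no sparsity.
```

With more iterations the loop behaves like an attractor: the codes settle toward a fixed point of the sparse-coding objective, and the returned reconstruction keeps only the input structure expressible by a few templates.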
Author Information
Baifeng Shi (UC Berkeley)
Yale Song (Microsoft Research)
Neel Joshi (Microsoft Research)
Trevor Darrell (University of California at Berkeley)
Xin Wang (Microsoft Research, Redmond)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: Visual Attention Emerges from Recurrent Sparse Reconstruction
  Thu. Jul 21st, 06:35 -- 06:40 PM, Room Hall F
More from the Same Authors
- 2021: Explaining Reinforcement Learning Policies through Counterfactual Trajectories
  Julius Frost · Olivia Watkins · Eric Weiner · Pieter Abbeel · Trevor Darrell · Bryan Plummer · Kate Saenko
- 2022: Back to the Source: Test-Time Diffusion-Driven Adaptation
  Jin Gao · Jialing Zhang · Xihui Liu · Trevor Darrell · Evan Shelhamer · Dequan Wang
- 2022 Poster: Zero-Shot Reward Specification via Grounded Natural Language
  Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell
- 2022 Spotlight: Zero-Shot Reward Specification via Grounded Natural Language
  Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell
- 2021 Workshop: ICML Workshop on Human in the Loop Learning (HILL)
  Trevor Darrell · Xin Wang · Li Erran Li · Fisher Yu · Zeynep Akata · Wenwu Zhu · Pradeep Ravikumar · Shiji Zhou · Shanghang Zhang · Kalesha Bullard
- 2021 Poster: Compositional Video Synthesis with Action Graphs
  Amir Bar · Roi Herzig · Xiaolong Wang · Anna Rohrbach · Gal Chechik · Trevor Darrell · Amir Globerson
- 2021 Spotlight: Compositional Video Synthesis with Action Graphs
  Amir Bar · Roi Herzig · Xiaolong Wang · Anna Rohrbach · Gal Chechik · Trevor Darrell · Amir Globerson
- 2020 Workshop: 2nd ICML Workshop on Human in the Loop Learning (HILL)
  Shanghang Zhang · Xin Wang · Fisher Yu · Jiajun Wu · Trevor Darrell
- 2020 Poster: Video Prediction via Example Guidance
  Jingwei Xu · Harry (Huazhe) Xu · Bingbing Ni · Xiaokang Yang · Trevor Darrell
- 2020 Poster: Frustratingly Simple Few-Shot Object Detection
  Xin Wang · Thomas Huang · Joseph E Gonzalez · Trevor Darrell · Fisher Yu
- 2019: Fisher Yu: "Motion and Prediction for Autonomous Driving"
  Fisher Yu · Trevor Darrell
- 2018 Poster: CyCADA: Cycle-Consistent Adversarial Domain Adaptation
  Judy Hoffman · Eric Tzeng · Taesung Park · Jun-Yan Zhu · Philip Isola · Kate Saenko · Alexei Efros · Trevor Darrell
- 2018 Oral: CyCADA: Cycle-Consistent Adversarial Domain Adaptation
  Judy Hoffman · Eric Tzeng · Taesung Park · Jun-Yan Zhu · Philip Isola · Kate Saenko · Alexei Efros · Trevor Darrell
- 2017 Poster: Curiosity-driven Exploration by Self-supervised Prediction
  Deepak Pathak · Pulkit Agrawal · Alexei Efros · Trevor Darrell
- 2017 Talk: Curiosity-driven Exploration by Self-supervised Prediction
  Deepak Pathak · Pulkit Agrawal · Alexei Efros · Trevor Darrell