The gradient of a deep neural network (DNN) w.r.t. the input provides information that can be used to explain the output prediction in terms of the input features, and it has been widely studied as an aid to interpreting DNNs. In a linear model (i.e., g(x) = wx + b), the gradient corresponds to the weights w. Such a model can reasonably serve as a local linear approximation of a smooth nonlinear DNN, and hence the weights of this local model are the gradient. The bias b, however, is usually overlooked in attribution methods. In this paper, we observe that since the bias in a DNN also makes a non-negligible contribution to the correctness of predictions, it can also play a significant role in understanding DNN behavior. We propose a backpropagation-type algorithm, “bias backpropagation (BBp)”, that starts at the output layer and iteratively attributes the bias of each layer to its input nodes, combining it with the resulting bias term of the previous layer. Together with the backpropagation of the gradient generating w, we can fully recover the locally linear model g(x) = wx + b. In experiments, we show that BBp can generate complementary and highly interpretable explanations.
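The locally linear view in the abstract can be illustrated with a minimal sketch, assuming a small fully connected ReLU network written in plain Python. For such a piecewise-linear network, one backward pass yields both the gradient w and an accumulated bias b such that the output equals wx + b at the given input. This is not the paper's BBp algorithm, which additionally attributes each layer's bias to individual input nodes; the helper names (forward, local_linear_model, rand_layer) and the random network are hypothetical and chosen only for illustration.

```python
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]


def forward(layers, x):
    """Run a ReLU MLP; return the output and each hidden layer's 0/1 mask."""
    masks = []
    h = x
    for i, (W, b) in enumerate(layers):
        z = [zi + bi for zi, bi in zip(matvec(W, h), b)]
        if i < len(layers) - 1:
            # Hidden layer: apply ReLU and remember which units are active.
            mask = [1.0 if zi > 0 else 0.0 for zi in z]
            masks.append(mask)
            h = [zi * mi for zi, mi in zip(z, mask)]
        else:
            # Output layer: linear, no activation (an assumption of this sketch).
            h = z
    return h, masks


def local_linear_model(layers, x, out_index=0):
    """Backward pass returning (w, b) with f(x)[out_index] == w.x + b at x."""
    _, masks = forward(layers, x)
    W_last, b_last = layers[-1]
    g = list(W_last[out_index])  # gradient w.r.t. the last hidden activation
    bias = b_last[out_index]     # start from the output layer's bias
    for (W, b), mask in zip(reversed(layers[:-1]), reversed(masks)):
        g = [gi * mi for gi, mi in zip(g, mask)]       # pass through the ReLU
        bias += sum(gi * bi for gi, bi in zip(g, b))   # this layer's bias share
        # Propagate the gradient to the layer's input: g <- W^T g.
        g = [sum(g[k] * W[k][j] for k in range(len(W)))
             for j in range(len(W[0]))]
    return g, bias


def rand_layer(n_out, n_in):
    """Hypothetical helper: a layer with uniform random weights and biases."""
    W = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [random.uniform(-1, 1) for _ in range(n_out)]
    return W, b


random.seed(0)
layers = [rand_layer(5, 3), rand_layer(4, 5), rand_layer(2, 4)]
x = [0.3, -1.2, 0.7]

y, _ = forward(layers, x)
w, b = local_linear_model(layers, x, out_index=0)
recon = sum(wi * xi for wi, xi in zip(w, x)) + b
print("network output:", y[0])
print("w.x + b       :", recon)
```

The two printed values should agree up to floating-point error, since the network is exactly linear within the activation region that contains x. The bias term b collects contributions from every layer, which is the quantity that gradient-only attribution leaves out.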
Author Information
Shengjie Wang (University of Washington, Seattle)
Tianyi Zhou (University of Washington)
Tianyi Zhou is currently a PhD student at the Paul G. Allen School of Computer Science and Engineering, University of Washington. He is supervised by Prof. Jeff Bilmes and Prof. Carlos Guestrin. He has published ~50 papers at NeurIPS, ICML, ICLR, AISTATS, NAACL, KDD, ICDM, IJCAI, AAAI, ISIT, Machine Learning Journal, IEEE TIP, IEEE TNNLS, IEEE TKDE, etc., with ~1700 citations. He is the recipient of the Best Student Paper Award at ICDM 2013.
Jeff Bilmes (UW)
Related Events (a corresponding poster, oral, or spotlight)

2019 Oral: Bias Also Matters: Bias Attribution for Deep Neural Network Explanation »
Thu Jun 13th 12:00 - 12:05 AM, Room Seaside Ballroom
More from the Same Authors

2021 : Tighter m-DPP Coreset Sample Complexity Bounds »
Gantavya Bhatt · Jeff Bilmes 
2021 : Tighter m-DPP Coreset Sample Complexity Bounds »
Jeff Bilmes · Gantavya Bhatt 
2021 : More Information, Less Data »
Jeff Bilmes 
2021 : Introduction by the Organizers »
Abir De · Rishabh Iyer · Ganesh Ramakrishnan · Jeff Bilmes 
2021 Workshop: Subset Selection in Machine Learning: From Theory to Applications »
Rishabh Iyer · Abir De · Ganesh Ramakrishnan · Jeff Bilmes 
2020 Poster: Coresets for Data-efficient Training of Machine Learning Models »
Baharan Mirzasoleiman · Jeff Bilmes · Jure Leskovec 
2020 Poster: Time-Consistent Self-Supervision for Semi-Supervised Learning »
Tianyi Zhou · Shengjie Wang · Jeff Bilmes 
2019 : Jeff Bilmes: Deep Submodular Synergies »
Jeff Bilmes 
2019 Poster: Jumpout: Improved Dropout for Deep Neural Networks with ReLUs »
Shengjie Wang · Tianyi Zhou · Jeff Bilmes 
2019 Poster: Combating Label Noise in Deep Learning using Abstention »
Sunil Thulasidasan · Tanmoy Bhattacharya · Jeff Bilmes · Gopinath Chennupati · Jamal Mohd-Yusof 
2019 Oral: Jumpout: Improved Dropout for Deep Neural Networks with ReLUs »
Shengjie Wang · Tianyi Zhou · Jeff Bilmes 
2019 Oral: Combating Label Noise in Deep Learning using Abstention »
Sunil Thulasidasan · Tanmoy Bhattacharya · Jeff Bilmes · Gopinath Chennupati · Jamal Mohd-Yusof 
2018 Poster: Constrained Interacting Submodular Groupings »
Andrew Cotter · Mahdi Milani Fard · Seungil You · Maya Gupta · Jeff Bilmes 
2018 Poster: Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions »
Wenruo Bai · Jeff Bilmes 
2018 Oral: Constrained Interacting Submodular Groupings »
Andrew Cotter · Mahdi Milani Fard · Seungil You · Maya Gupta · Jeff Bilmes 
2018 Oral: Greed is Still Good: Maximizing Monotone Submodular+Supermodular (BP) Functions »
Wenruo Bai · Jeff Bilmes