Timezone: »

Oral
Bias Also Matters: Bias Attribution for Deep Neural Network Explanation
Shengjie Wang · Tianyi Zhou · Jeff Bilmes

Wed Jun 12 05:00 PM -- 05:05 PM (PDT) @ Seaside Ballroom
The gradient of a deep neural network (DNN) w.r.t. the input provides information that can be used to explain the output prediction in terms of the input features and has been widely studied to assist in interpreting DNNs. In a linear model (i.e., $g(x)=wx+b$), the gradient corresponds solely to the weights $w$. Such a model can reasonably locally linearly approximate a smooth nonlinear DNN, and hence the weights of this local model are the gradient. The other part, however, of a local linear model, i.e., the bias $b$, is usually overlooked in attribution methods since it is not part of the gradient. In this paper, we observe that since the bias in a DNN also has a non-negligible contribution to the correctness of predictions, it can also play a significant role in understanding DNN behaviors. In particular, we study how to attribute a DNN's bias to its input features. We propose a backpropagation-type algorithm bias back-propagation (BBp)'' that starts at the output layer and iteratively attributes the bias of each layer to its input nodes as well as combining the resulting bias term of the previous layer. This process stops at the input layer, where summing up the attributions over all the input features exactly recovers $b$. Together with the backpropagation of the gradient generating $w$, we can fully recover the locally linear model $g(x)=wx+b$. Hence, the attribution of the DNN outputs to its inputs is decomposed into two parts, the gradient $w$ and the bias attribution, providing separate and complementary explanations. We study several possible attribution methods applied to the bias of each layer in BBp. In experiments, we show that BBp can generate complementary and highly interpretable explanations of DNNs in addition to gradient-based attributions.

#### Author Information

##### Tianyi Zhou (University of Washington)

Tianyi Zhou is currently a PhD student at Paul G. Allen school of Computer Science and Engineering, University of Washington. He is supervised by Prof. Jeff Bilmes and Prof. Carlos Guestrin. He published ~50 papers at NeurIPS, ICML, ICLR, AISTATS, NAACL, KDD, ICDM, IJCAI, AAAI, ISIT, Machine Learning Journal, IEEE TIP, IEEE TNNLS, IEEE TKDE, etc, with ~1700 citations. He is the recipient of the Best student paper award at ICDM 2013.