Workshop: XXAI: Extending Explainable AI Beyond Deep Models and Classifiers
Contributed Talk 1: Sun et al. - Understanding Image Captioning Models beyond Visualizing Attention
This paper explains the predictions of attention-based image captioning models beyond visualizing the attention itself. The authors develop variants of layer-wise relevance propagation (LRP) tailored to image captioning models with attention mechanisms and show that the resulting explanations, firstly, correlate with object locations more precisely than attention; secondly, identify object words that are unsupported by the image content; and thirdly, provide guidance for improving the model. Results are reported for two attention-based image captioning models trained on the Flickr30K and MSCOCO2017 datasets. The experimental analyses demonstrate the strength of explanation methods for understanding image captioning attention models.
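To give a flavor of the underlying mechanism, the core of LRP is a backward redistribution of relevance through each layer. Below is a minimal sketch of the widely used epsilon-rule for a single linear layer; the function name, shapes, and epsilon value are illustrative assumptions, not the authors' implementation for attention models.

```python
import numpy as np

def lrp_epsilon_linear(a, W, R_out, eps=1e-6):
    """Redistribute output relevance R_out onto the inputs of a linear layer.

    a     : (d_in,)       layer input activations
    W     : (d_in, d_out) weight matrix
    R_out : (d_out,)      relevance assigned to the layer's outputs
    Names and shapes are hypothetical, for illustration only.
    """
    z = a @ W                    # pre-activations: z_j = sum_i a_i * w_ij
    z = z + eps * np.sign(z)     # epsilon stabilizer avoids division by ~zero
    s = R_out / z                # per-output relevance sensitivities
    c = W @ s                    # backpropagate sensitivities to the inputs
    return a * c                 # R_i = a_i * sum_j w_ij * s_j

# Conservation check: input relevance sums approximately to output relevance.
rng = np.random.default_rng(0)
a = rng.standard_normal(5)
W = rng.standard_normal((5, 3))
R_out = rng.standard_normal(3)
R_in = lrp_epsilon_linear(a, W, R_out)
```

Applying such rules layer by layer yields per-pixel (and per-word) relevance scores, which is what allows the explanations described above to be compared against attention maps and object locations.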