[DR020] Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

Posted on October 12, 2017

According to the paper, previous methods for interpreting neural networks suffer from several disadvantages: (a) low resolution, (b) lack of class discrimination, and (c) being limited to a specific type of network. Grad-CAM is an improvement over CAM. Since information is abstracted layer by layer and fully-connected layers cannot maintain spatial information, they select the last convolutional layer for analysis.

For a class c and a filter k, they compute the importance weight of this filter for the class as:

$$\alpha_k^c = \frac{1}{Z} \sum_i \sum_j \frac{\partial y^c}{\partial A_{ij}^k}$$

where $y^c$ is the class score for class $c$ (before softmax), $A_{ij}^k$ is an element of the $k$-th feature map of the last convolutional layer, and $Z$ is the number of spatial locations. The final Grad-CAM map is a ReLU-ed weighted sum across all channels:

$$L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\Big(\sum_k \alpha_k^c A^k\Big)$$
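These two steps can be sketched in a few lines of NumPy, assuming the activations `A` (shape K×H×W) and the gradients `dy_dA` of the class score with respect to `A` have already been extracted from the network (the variable and function names here are mine, not the paper's):

```python
import numpy as np

def grad_cam(A, dy_dA):
    """Compute a Grad-CAM heat map from the last convolutional layer.

    A:      activations, shape (K, H, W) -- K feature maps
    dy_dA:  gradient of the class score y^c w.r.t. A, same shape
    """
    # alpha_k^c: global-average-pool the gradients over the spatial dims
    alpha = dy_dA.mean(axis=(1, 2))                            # shape (K,)
    # weighted sum of feature maps, then ReLU to keep positive influence
    cam = np.maximum((alpha[:, None, None] * A).sum(axis=0), 0.0)
    return cam                                                 # shape (H, W)
```

In a real pipeline, `A` and `dy_dA` would come from a forward/backward pass through the network; the heat map is then upsampled to the input resolution for visualization.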

ReLU is employed to keep only the positive influence, so the remaining activations have a positive relationship with the given class label. By negating the gradients (swapping positive and negative influence before the ReLU), Grad-CAM can instead localize counterfactual regions: areas based on which the network would "believe" the current label is wrong. Besides, to obtain higher resolution, they combine Grad-CAM with Guided Backpropagation (Guided Grad-CAM) to refine the result.
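The two variants above can be sketched on top of the same quantities. The nearest-neighbour upsampling and the element-wise product with a precomputed Guided Backpropagation map `gb` are my simplifications for illustration, not the paper's exact pipeline:

```python
import numpy as np

def counterfactual_cam(A, dy_dA):
    # Negate the gradients: feature maps that argue *against*
    # the class now survive the ReLU instead.
    alpha = (-dy_dA).mean(axis=(1, 2))
    return np.maximum((alpha[:, None, None] * A).sum(axis=0), 0.0)

def guided_grad_cam(cam, gb):
    # cam: (H, W) coarse Grad-CAM map; gb: (H0, W0) Guided Backprop saliency.
    # Upsample cam to gb's resolution (nearest neighbour, assuming the
    # sizes divide evenly), then fuse by element-wise multiplication.
    sy, sx = gb.shape[0] // cam.shape[0], gb.shape[1] // cam.shape[1]
    cam_up = np.kron(cam, np.ones((sy, sx)))
    return gb * cam_up
```

The fused map inherits the class discrimination of Grad-CAM and the fine detail of Guided Backpropagation.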


Although the method is very simple, they conduct extensive experiments and interpret neural networks in a variety of tasks. I strongly suggest you read the experiments yourselves.


[DR020]: TBA