Grad-CAM (Gradient-weighted Class Activation Mapping) is a technique for “visually interpreting” the predictions of a Convolutional Neural Network (CNN)-based model. It uses the gradients of a target concept (for example, the predicted class “cat”) flowing into the final convolutional layer to produce a coarse localization map that highlights the regions of the image that are important for predicting that concept.
This implementation follows the original Grad-CAM paper, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization”.
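The computation described above can be sketched in a few lines of NumPy. This is only an illustration, not the implementation used later: it assumes the activations of the final convolutional layer and the gradients of the class score with respect to them have already been extracted as arrays of shape (H, W, K). The function name `grad_cam` and its arguments are hypothetical.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Coarse localization map from final-conv-layer activations and gradients.

    feature_maps: array of shape (H, W, K), activations of the last conv layer.
    gradients:    array of shape (H, W, K), d(class score) / d(feature_maps).
    Returns an (H, W) map scaled to [0, 1].
    """
    # Channel weights: global-average-pool the gradients over the spatial axes.
    weights = gradients.mean(axis=(0, 1))                       # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))  # shape (H, W)
    # Keep only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0)
    # Scale to [0, 1] for display; skip if the map is all zeros.
    max_val = cam.max()
    if max_val > 0:
        cam = cam / max_val
    return cam
```

The resulting map has the spatial resolution of the convolutional layer, so it is usually upsampled to the input image size before being overlaid as a heatmap.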
Image Reading and Manipulation
We will use the OpenCV library to read in the images, then crop and resize them. This is necessary because the pretrained VGG16 neural network expects input images of a fixed size.
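A minimal sketch of this preprocessing step is shown below. The helper names `center_crop_square` and `load_for_vgg16` are hypothetical, and two choices are assumptions rather than something stated above: the crop is a center crop to a square, and the target size is 224×224, the standard VGG16 input size.

```python
import numpy as np

def center_crop_square(image):
    """Crop the largest centered square from an (H, W, C) or (H, W) image."""
    h, w = image.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return image[top:top + side, left:left + side]

def load_for_vgg16(path, size=224):
    """Read an image with OpenCV, center-crop it, and resize it to size x size."""
    # Imported here so that the crop helper can be used without OpenCV installed.
    import cv2

    image = cv2.imread(path)  # uint8 array in BGR channel order, or None on failure
    if image is None:
        raise FileNotFoundError(f"could not read image: {path}")
    image = center_crop_square(image)
    # cv2.resize takes the target size as (width, height).
    return cv2.resize(image, (size, size), interpolation=cv2.INTER_AREA)
```

Cropping before resizing keeps the aspect ratio intact, so objects are not stretched. Note that `cv2.imread` returns channels in BGR order; whether a conversion to RGB is needed depends on the preprocessing the pretrained weights expect.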
Grad-CAM can be combined with guided backpropagation to obtain sharper, pixel-level visualizations (Guided Grad-CAM). For this purpose we override the gradient of the TensorFlow ReLU activation function, which is used when backpropagating through our deep neural network classifier.
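The rule that guided backpropagation substitutes for the ReLU gradient can be illustrated in plain NumPy. This sketch shows only the rule itself, not the TensorFlow override; in TensorFlow the same rule would be attached to the ReLU op through a custom gradient. The function names are hypothetical.

```python
import numpy as np

def relu_backward(x, upstream_grad):
    """Standard ReLU gradient: pass the gradient where the forward input was positive."""
    return upstream_grad * (x > 0)

def guided_relu_backward(x, upstream_grad):
    """Guided backpropagation: additionally block negative upstream gradients."""
    return upstream_grad * (x > 0) * (upstream_grad > 0)

x = np.array([-1.0, 2.0, 3.0, -4.0])  # inputs seen by ReLU in the forward pass
g = np.array([5.0, -6.0, 7.0, -8.0])  # gradient arriving from the layer above

standard = relu_backward(x, g)        # negative gradient at index 1 passes through
guided = guided_relu_backward(x, g)   # only index 2 survives (x > 0 and g > 0)
```

Suppressing negative gradients keeps only the signal from pixels that increase the activation of higher layers, which is what makes the resulting visualizations less noisy than plain backpropagation.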