Summary by Hadrien Bertrand
The paper designs some basic tests to compare saliency methods. It finds that some of the most popular methods are independent of both the model parameters and the data, meaning they are effectively useless.
## Methods compared
The paper compares the following methods: gradient explanation, gradient × input, integrated gradients, guided backprop, Guided GradCAM and SmoothGrad. A refresher on these methods is provided in the appendix.
All these methods fit into the same framework: they take a classification model and an input (typically an image), and output an *explanation map* with the same shape as the input, where a higher value for a feature implies greater relevance to the model's decision.
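To make this framework concrete, here is a minimal sketch of the plain gradient method in PyTorch; this is my own illustration, not the paper's code, and the model and input are placeholders:

```python
import torch
import torchvision.models as models

# Stand-in model: an Inception v3, as used in the paper's ImageNet experiments.
model = models.inception_v3(weights="IMAGENET1K_V1").eval()

def gradient_explanation(model, x):
    """Return |d(top-class score)/d(input)|, an explanation map shaped like x."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    top_class = logits.argmax(dim=1)
    score = logits[torch.arange(x.shape[0]), top_class].sum()
    score.backward()
    return x.grad.abs()  # same shape as the input

x = torch.randn(1, 3, 299, 299)            # placeholder for a preprocessed image
saliency = gradient_explanation(model, x)  # shape (1, 3, 299, 299)
```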
## Metrics of comparison
The authors argue that visual inspection of the saliency maps can be misleading. They propose to compute the Spearman rank correlation, the structural similarity index (SSIM) and the Pearson correlation of the histograms of gradients (HOGs). The authors point out that these metrics capture different notions of similarity, but note that measuring similarity is still an active area of research and the metrics are imperfect.
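A rough sketch of these three metrics using SciPy and scikit-image; the paper's exact preprocessing of the maps (e.g. normalization, absolute values) may differ:

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr
from skimage.metrics import structural_similarity
from skimage.feature import hog

def compare_maps(m1, m2):
    """Compare two 2-D explanation maps with the three metrics from the paper."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    rank_corr, _ = spearmanr(m1.ravel(), m2.ravel())
    data_range = max(m1.max(), m2.max()) - min(m1.min(), m2.min())
    ssim = structural_similarity(m1, m2, data_range=data_range)
    hog_corr, _ = pearsonr(hog(m1), hog(m2))   # Pearson correlation of HOGs
    return {"spearman": rank_corr, "ssim": ssim, "hog_pearson": hog_corr}
```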
## First test: model parameters randomization
A saliency method must depend on the model parameters, otherwise it cannot help us understand the model. In this test, the authors randomize the model parameters, layer by layer, starting from the top.
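This cascading randomization can be sketched as follows, reusing the `gradient_explanation` placeholder from above (the paper also reports an independent per-layer randomization variant):

```python
import copy
import torch.nn as nn

def cascading_randomization(model, x):
    """Re-initialize layers from the top down, recording a map after each step."""
    randomized = copy.deepcopy(model)
    maps = []
    layers = [m for m in randomized.modules()
              if isinstance(m, (nn.Conv2d, nn.Linear))]
    for layer in reversed(layers):               # start from the logits layer
        nn.init.normal_(layer.weight, std=0.01)  # destroy the learned weights
        if layer.bias is not None:
            nn.init.zeros_(layer.bias)
        maps.append(gradient_explanation(randomized, x))
    return maps  # compare each map against the trained model's map
```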
Surprisingly, methods such as guided backprop and Guided GradCAM are completely insensitive to the model parameters, as illustrated here for an Inception v3 trained on ImageNet:
![image](https://user-images.githubusercontent.com/8659132/61403152-b10b8000-a8a2-11e9-9f6a-cf1ed6a876cc.png)
Integrated gradients also looks dubious, as the bird is still visible with a mostly randomized model, but the quantitative metrics reveal that the difference between the two models is actually large.
## Second test: data randomization
It is well known that randomly shuffling the labels of a dataset does not prevent a neural network from reaching high accuracy on the training set, though it does prevent generalization: the model learns by memorizing the data or latching onto spurious patterns. As a result, saliency maps obtained from such a network should carry no clearly interpretable signal.
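The data randomization itself is just a label permutation; a minimal sketch (the variable names are placeholders):

```python
import numpy as np

def shuffle_labels(labels, seed=0):
    """Permute the labels, destroying any relationship between images and labels."""
    rng = np.random.default_rng(seed)
    return rng.permutation(labels)

# y_train_shuffled = shuffle_labels(y_train)
# A ConvNet can still reach high *training* accuracy on (x_train, y_train_shuffled),
# but its saliency maps should then carry no class-related structure.
```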
Here are the results for a ConvNet trained on MNIST and on a label-shuffled MNIST:
![image](https://user-images.githubusercontent.com/8659132/61406757-7efe1c00-a8aa-11e9-9826-a859a373cb4f.png)
The results are very damning for most methods: only the gradient and GradCAM maps differ clearly between the two models, as confirmed by the low correlations.
## Discussion
- Even though some methods do not depend on the model parameters or the data, they may still depend on the model architecture, which could be useful in some contexts.
- Methods that multiply the gradient with the input are dominated by the input.
- Some of the more complex saliency methods essentially act as fancy edge detectors.
- Only the gradient, SmoothGrad and GradCAM survive the sanity checks.
## Comments
- Why are their GradCAM maps so ugly? They don't look like typical GradCAM maps at all.
- Their tests are simple enough that it's hard to defend a method that doesn't pass them.
- The methods that are left are not very good either. They give fuzzy maps that are difficult to interpret.
- In the case of integrated gradients (IG), I'm not convinced this is sufficient to discard the method. IG requires a "baseline input" that represents the absence of features. For images, people usually just set the baseline to 0, which is not at all the absence of features. The authors also use the "set the image to 0" strategy, and I'd say their tests are damning for this strategy, not for IG in general. I'd expect an estimation of the baseline such as the one done in [this paper](https://arxiv.org/abs/1702.04595) to be a fairer evaluation of IG (a sketch of IG with the zero baseline is given below).
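For reference, here is a minimal sketch of integrated gradients with the all-zeros baseline criticized above, approximating the straight-line path integral with a Riemann sum; this is not the paper's implementation:

```python
import torch

def integrated_gradients(model, x, target, baseline=None, steps=50):
    """IG attribution for class `target`, using the common zero baseline by default."""
    if baseline is None:
        baseline = torch.zeros_like(x)   # the questionable "absence of features"
    total = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        score = model(point)[:, target].sum()
        grad, = torch.autograd.grad(score, point)
        total += grad
    return (x - baseline) * total / steps   # same shape as the input
```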
Code: [GitHub](https://github.com/adebayoj/sanity_checks_saliency) (not available as of 17/07/19)