The paper designs some basic sanity checks to compare saliency methods. It finds that some of the most popular methods are independent of both the model parameters and the data, which makes them effectively useless as explanations of the model.

## Methods compared

The paper compares the following methods: gradient explanation, gradient x input, integrated gradients, guided backprop, guided GradCam and SmoothGrad. A refresher on these methods is provided in the appendix. All of them fit the same framework: they take a classification model and an input (typically an image) and output an *explanation map* with the same shape as the input, where a higher value for a feature implies greater relevance to the model's decision.

## Metrics of comparison

The authors argue that visual inspection of saliency maps can be misleading. They propose instead to compute the Spearman rank correlation, the structural similarity index (SSIM) and the Pearson correlation of the histogram of gradients between maps. They point out that these metrics capture different notions of similarity, that this is an active area of research, and that the metrics are imperfect. A sketch of how such a comparison could be computed is given below.
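For concreteness, here is a minimal sketch (not from the paper's code) of how two 2D explanation maps could be compared with these three metrics, using SciPy and scikit-image; the `compare_maps` helper and the HOG cell size are my own choices:

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr
from skimage.feature import hog
from skimage.metrics import structural_similarity


def compare_maps(map_a: np.ndarray, map_b: np.ndarray) -> dict:
    """Compare two 2D explanation maps with the three similarity metrics."""
    # Spearman rank correlation on the flattened maps.
    rank_corr, _ = spearmanr(map_a.ravel(), map_b.ravel())

    # Structural similarity index (SSIM); data_range must be given for float inputs.
    data_range = max(map_a.max() - map_a.min(), map_b.max() - map_b.min())
    ssim = structural_similarity(map_a, map_b, data_range=data_range)

    # Pearson correlation of the histogram-of-gradients descriptors.
    hog_a = hog(map_a, pixels_per_cell=(16, 16))
    hog_b = hog(map_b, pixels_per_cell=(16, 16))
    hog_corr, _ = pearsonr(hog_a, hog_b)

    return {"spearman": rank_corr, "ssim": ssim, "hog_pearson": hog_corr}
```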
## First test: model parameter randomization

A saliency method must depend on the model parameters, otherwise it cannot help us understand the model. In this test, the authors randomize the model parameters layer by layer, starting from the top. Surprisingly, methods such as guided backprop and guided GradCam are completely insensitive to the model parameters, as illustrated in the paper on an Inception v3 trained on ImageNet. Integrated gradients also looks dubious, since the bird is still visible with a mostly randomized model, but the quantitative metrics reveal that the difference between the two models is actually large.

## Second test: data randomization

It is well known that randomly shuffling the labels of a dataset does not prevent a neural network from reaching high accuracy on the training set, although it does prevent generalization: the model can only learn by memorizing the data or finding spurious patterns. As a result, saliency maps obtained from such a network should show no clearly interpretable signal. The paper compares a ConvNet trained on MNIST with one trained on a label-shuffled MNIST. The results are damning for most methods: only gradients and GradCam differ clearly between the two models, as confirmed by the low correlation.

## Discussion

- Even though some methods do not depend on the model parameters or the data, they might still depend on the architecture of the model, which could be of some use in certain contexts.
- Methods that multiply the input with the gradient are dominated by the input.
- Complex saliency methods are just fancy edge detectors.
- Only gradient, SmoothGrad and GradCam survive the sanity checks.

# Comments

- Why are their GradCam maps so ugly? They don't look like usual GradCam maps at all.
- Their tests are simple enough that it's hard to defend a method that doesn't pass them.
- The methods that survive are not very good either: they give fuzzy maps that are difficult to interpret.
- In the case of integrated gradients (IG), I'm not convinced this is sufficient to discard the method. IG requires a "baseline input" that represents the absence of features. For images, people usually just set the image to 0, which is not at all the absence of a feature. The authors also use the "set the image to 0" strategy, and I'd say their tests are damning for this strategy, not for IG in general. I'd expect an estimation of the baseline, such as the one done in [this paper](https://arxiv.org/abs/1702.04595), to be a fairer evaluation of IG (see the sketch below).

Code: [GitHub](https://github.com/adebayoj/sanity_checks_saliency) (not available as of 17/07/19)
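To make the baseline point concrete, here is a minimal PyTorch sketch of integrated gradients (my own, not the authors' implementation) in which the baseline is an explicit argument, so a zero image, a blurred copy, or an estimated baseline can be swapped in; `integrated_gradients` and its signature are hypothetical:

```python
import torch


def integrated_gradients(model, x, target_class, baseline, steps=50):
    """Approximate IG of `model` for input `x` relative to `baseline`.

    x, baseline: tensors of shape (1, C, H, W); baseline encodes "absence of features".
    """
    model.eval()
    total_grads = torch.zeros_like(x)

    # Riemann approximation of the path integral from the baseline to the input.
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        score = model(point)[0, target_class]
        grad = torch.autograd.grad(score, point)[0]
        total_grads += grad

    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * total_grads / steps
```

Calling it with `baseline=torch.zeros_like(x)` gives the common zero-baseline variant criticized above; passing an estimated baseline instead would be the fairer test of IG itself.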
**Idea:** With the growing use of visual explanation systems for machine learning models, such as saliency maps, there needs to be a standardized way of verifying whether a saliency method correctly describes the underlying ML model.

**Solution:** The paper proposes two sanity checks to verify the accuracy and faithfulness of a saliency method:

* *Model parameter randomization test:* The output of a saliency method on a trained model is compared to the output of the same method on an untrained, randomly parameterized model. If the resulting maps are similar or identical, the saliency method does not actually describe the model. The experiments show that certain methods, such as guided backprop, produce essentially constant explanations despite these alterations to the model. (A minimal sketch of this test is given after this summary.)
* *Data randomization test:* This test explores the relationship between saliency methods and the training data and its labels. The labels of the training data are randomized, so the trained model cannot encode any genuine label-related pattern (it is as good as randomly guessing an output label). If the saliency maps still show a definite pattern, the method is independent of the underlying model and training labels. In this test as well, guided backprop did not fare well, suggesting it behaves more like an edge detector than an explainer of the model.

The paper thus makes a valid argument for standardized tests that an interpretation method must satisfy to be deemed accurate or faithful.
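As a rough illustration of the model parameter randomization test (a sketch under assumptions, not the authors' implementation), one can copy a PyTorch model, re-initialize its parameterized layers one at a time starting from the one closest to the output (assuming `modules()` roughly follows the forward order), and recompute a plain gradient saliency map after each step:

```python
import copy
import torch


def gradient_saliency(model, x, target_class):
    """Plain gradient explanation: d(score)/d(input), max of |grad| over channels."""
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]
    grad = torch.autograd.grad(score, x)[0]
    return grad.abs().amax(dim=1).squeeze(0)  # (H, W) map


def cascading_randomization(model, x, target_class):
    """Randomize layers from top to bottom, yielding a saliency map after each step."""
    randomized = copy.deepcopy(model)
    layers = [m for m in randomized.modules()
              if hasattr(m, "weight") and m.weight is not None]
    maps = []
    for layer in reversed(layers):  # start from the layer closest to the output
        torch.nn.init.normal_(layer.weight, std=0.01)
        if getattr(layer, "bias", None) is not None:
            torch.nn.init.zeros_(layer.bias)
        maps.append(gradient_saliency(randomized, x, target_class))
    return maps
```

Each returned map would then be compared to the map of the intact model (e.g. with the metrics sketched earlier); a method that passes the check should degrade as more layers are randomized.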