Meng and Chen propose MagNet, a combination of adversarial example detection and removal. At test time, given a clean or adversarial test image, the proposed defense works as follows: First, the input is passed through one or multiple detectors. If one of these detectors fires, the input is rejected. To this end, the authors consider detection based on the reconstruction error of an auto-encoder or detection based on the divergence between probability predictions (on adversarial vs. clean example). Second, if not rejected, the input is passed through a reformed. The reformer reconstructs the input, e.g., through an auto-encoder, to remove potentially undetected adversarial noise.
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).