Towards Imperceptible and Robust Adversarial Example Attacks against Neural Networks
Bo Luo, Yannan Liu, Lingxiao Wei, and Qiang Xu
arXiv e-Print archive - 2018 via Local arXiv
Keywords:
cs.LG, cs.CR, stat.ML
First published: 2018/01/15
Abstract: Machine learning systems based on deep neural networks, being able to produce
state-of-the-art results on various perception tasks, have gained mainstream
adoption in many applications. However, they are shown to be vulnerable to
adversarial example attack, which generates malicious output by adding slight
perturbations to the input. Previous adversarial example crafting methods,
however, use simple metrics to evaluate the distances between the original
examples and the adversarial ones, which could be easily detected by human
eyes. In addition, these attacks are often not robust due to the inevitable
noises and deviation in the physical world. In this work, we present a new
adversarial example attack crafting method, which takes the human perceptual
system into consideration and maximizes the noise tolerance of the crafted
adversarial example. Experimental results demonstrate the efficacy of the
proposed technique.
Luo et al. propose a method for computing adversarial examples that are less perceptible than those produced by standard attacks constrained in $L_p$ norms. In particular, they consider the local variation of the image and argue that humans are more likely to notice perturbations in low-variance (smooth) regions than in high-variance (textured) ones. The sensitivity of a pixel is therefore defined as one over its local variance, so that pixels in smooth regions count as more sensitive to perturbations. They propose a simple algorithm that iteratively sorts pixels by their sensitivity and selects a subset to perturb in each step. Personally, I wonder why they do not integrate the sensitivity into a simple projected gradient descent attack, where a Lagrange multiplier is used to enforce the $L_p$ norm of the sensitivity-weighted perturbation. However, qualitative results show that their approach also works well and results in (partly) less perceptible changes, see Figure 1.
https://i.imgur.com/M7Ile8Y.png
Figure 1: Qualitative results including a comparison to other state-of-the-art attacks.
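To make the sensitivity idea concrete, here is a minimal NumPy sketch of the description in this summary, not the authors' implementation. It assumes a single-channel image with values in $[0, 1]$; the window size, the small `eps` guarding against division by zero, the step size, and the `direction` array (for example, the sign of a loss gradient) are all illustrative assumptions, as are the function names.

```python
import numpy as np


def local_variance(image, window=3):
    """Variance of each pixel's window x window neighborhood (edge-padded)."""
    pad = window // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    # View of all window x window patches, shape (H, W, window, window).
    patches = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    return patches.var(axis=(-2, -1))


def sensitivity(image, window=3, eps=1e-8):
    """One over local variance; eps (an assumption) avoids division by zero."""
    return 1.0 / (local_variance(image, window) + eps)


def select_least_sensitive(image, k, window=3):
    """Indices (rows, cols) of the k pixels with the lowest sensitivity."""
    sens = sensitivity(image, window)
    flat = np.argsort(sens, axis=None, kind="stable")[:k]
    return np.unravel_index(flat, image.shape)


def perturb_least_sensitive(image, direction, k, step=0.05, window=3):
    """One greedy step: nudge only the k least sensitive pixels.

    `direction` is a hypothetical per-pixel update direction supplied by the
    caller, e.g. the sign of a gradient with respect to the input.
    """
    idx = select_least_sensitive(image, k, window)
    out = image.astype(np.float64).copy()
    out[idx] += step * direction[idx]
    return np.clip(out, 0.0, 1.0)
```

Calling `perturb_least_sensitive` repeatedly, with `direction` recomputed from the model each time, gives an iterative attack in the spirit of the one described above: perturbations are concentrated in textured regions, where they should be harder to see.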
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).