Adversarial Patch
Tom B. Brown, Dandelion Mané, Aurko Roy, Martín Abadi, Justin Gilmer
arXiv e-Print archive, 2017
Brown et al. introduce a universal adversarial patch that, when added to an image, causes a targeted misclassification. The concept is illustrated in Figure 1: essentially, a “sticker” is computed that, when placed at a random location on an image, causes the classifier to predict a chosen target class. In practice, the optimized objective can be written as
$\max_p \mathbb{E}_{x\sim X, t \sim T, l \sim L} \left[\log \Pr(y \mid A(p,x,l,t))\right]$
where $y$ is the target label and $X$, $T$ and $L$ are the data space, the transformation space, and the location space, respectively. The function $A$ takes the image $x$ and the patch $p$ as input and places the patch on the image according to the transformation $t$ and the location $l$. Note that the patch itself is unconstrained, in contrast to general adversarial examples, which are typically restricted to a small norm ball around the original image. The computed patch might look as illustrated in Figure 1.
https://i.imgur.com/a0AB6Wz.png
Figure 1: Illustration of the optimization procedure to obtain adversarial patches.
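
The optimization amounts to stochastic gradient ascent over the patch pixels, sampling a fresh image, location, and transformation in every step. Below is a minimal PyTorch sketch of this procedure, not the authors' implementation: the classifier, target class, random input batch, and hyperparameters are illustrative assumptions, and the transformation $t$ is reduced to random placement $l$ for brevity.

```python
# Minimal sketch of adversarial patch training (assumptions noted inline).
import torch
import torch.nn.functional as F
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
# Illustrative choice of classifier; the paper attacks several ImageNet models.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).to(device).eval()

target_class = 859  # hypothetical target label y
patch_size = 64
patch = torch.rand(3, patch_size, patch_size, device=device, requires_grad=True)
optimizer = torch.optim.Adam([patch], lr=0.01)

def apply_patch(images, patch):
    """A(p, x, l, t): paste patch p onto image x at a random location l.
    Rotation and scaling (the transformation t) are omitted for brevity."""
    patched = images.clone()
    _, _, h, w = images.shape
    s = patch.shape[-1]
    for i in range(images.size(0)):
        top = torch.randint(0, h - s + 1, (1,)).item()
        left = torch.randint(0, w - s + 1, (1,)).item()
        patched[i, :, top:top + s, left:left + s] = patch
    return patched

for step in range(100):
    # Stand-in for sampling x ~ X; real training would draw (normalized)
    # ImageNet batches instead of random noise.
    images = torch.rand(8, 3, 224, 224, device=device)
    optimizer.zero_grad()
    logits = model(apply_patch(images, patch))
    targets = torch.full((images.size(0),), target_class, device=device)
    # Maximizing E[log Pr(y | A(p, x, l, t))] = minimizing cross-entropy to y.
    loss = F.cross_entropy(logits, targets)
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        patch.clamp_(0, 1)  # keep the patch a printable image
```

Note that only the pixel-range clamp constrains the patch; unlike norm-bounded adversarial examples, the patch is not required to stay close to any original image content, which is what makes it printable and physically realizable.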
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).