Song et al. propose generative adversarial examples, which are crafted from scratch using a generative adversarial network (GAN). In particular, a GAN is trained on the original images in order to approximate the underlying data distribution. Adversarial examples can then be found in the learned latent space by searching for a latent code that minimizes a loss with three terms: fooling the target classifier, not fooling an auxiliary classifier (so that the actual class does not change), and (optionally) staying close to some fixed random latent code. These adversarial examples no longer correspond to original images; instead, they are unrestricted and computed from scratch. Figure 1 shows examples.
Figure 1: Adversarial examples found in the image space using projected gradient descent (PGD, top), and adversarial examples found in the latent space, as proposed.
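To make the latent-space search described above more concrete, here is a minimal NumPy sketch of the three-term loss and a simple descent on the latent code. Everything in it is an assumption for illustration and not the authors' implementation: the generator and both classifiers are random toy maps, the names `latent_attack_loss`, `lam_aux` and `lam_dist` are hypothetical, the attack is assumed to be targeted, the closeness term is a plain squared distance, and the optimizer is finite-difference gradient descent with step halving.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over a 1-D vector of logits.
    shifted = logits - np.max(logits)
    return shifted - np.log(np.sum(np.exp(shifted)))

def latent_attack_loss(z, z0, generator, target_clf, aux_clf,
                       y_source, y_target, lam_aux=1.0, lam_dist=0.1):
    """Three-term loss over a latent code z (all callables are stand-ins)."""
    x = generator(z)
    fool = -log_softmax(target_clf(x))[y_target]  # target classifier should predict y_target
    keep = -log_softmax(aux_clf(x))[y_source]     # auxiliary classifier should still predict y_source
    dist = np.sum((z - z0) ** 2)                  # optional: stay close to the fixed random code z0
    return fool + lam_aux * keep + lam_dist * dist

def numerical_grad(f, z, eps=1e-5):
    # Central finite differences; fine for a toy, too slow for real models.
    grad = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        grad[i] = (f(z + step) - f(z - step)) / (2 * eps)
    return grad

# Toy stand-ins (assumptions): a random "generator" and two random linear classifiers.
rng = np.random.default_rng(0)
G = rng.normal(size=(8, 4))
W_target = rng.normal(size=(3, 8))
W_aux = rng.normal(size=(3, 8))
generator = lambda z: np.tanh(G @ z)
target_clf = lambda x: W_target @ x
aux_clf = lambda x: W_aux @ x

z0 = rng.normal(size=4)  # fixed random latent code
loss = lambda code: latent_attack_loss(code, z0, generator, target_clf, aux_clf,
                                       y_source=0, y_target=1)

z, lr = z0.copy(), 0.1
initial = loss(z)
for _ in range(100):
    candidate = z - lr * numerical_grad(loss, z)
    if loss(candidate) < loss(z):
        z = candidate   # accept only steps that decrease the loss
    else:
        lr *= 0.5       # otherwise shrink the step size
final = loss(z)
print(initial, final)
```

With a real model, the toy maps would be replaced by the trained GAN generator and the two classifiers, and the finite-difference gradient by automatic differentiation. The weights `lam_aux` and `lam_dist` trade off preserving the class against staying near `z0`; the values used here are arbitrary.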
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).