First published: 2017/08/05 (5 years ago) Abstract: Deep neural networks (DNNs) provide state-of-the-art results on various tasks
and are widely used in real world applications. However, it was discovered that
machine learning models, including the best performing DNNs, suffer from a
fundamental problem: they can unexpectedly and confidently misclassify examples
formed by slightly perturbing otherwise correctly recognized inputs. Various
approaches have been developed for efficiently generating these so-called
adversarial examples, but those mostly rely on ascending the gradient of loss.
In this paper, we introduce the novel logits optimized targeting system (LOTS)
to directly manipulate deep features captured at the penultimate layer. Using
LOTS, we analyze and compare the adversarial robustness of DNNs using the
traditional Softmax layer with Openmax, which was designed to provide open set
recognition by defining classes derived from deep representations, and is
claimed to be more robust to adversarial perturbations. We demonstrate that
Openmax provides less vulnerable systems than Softmax to traditional attacks,
however, we show that it can be equally susceptible to more sophisticated
adversarial generation techniques that directly work on deep representations.
Rozsa et al. describe an adersarial attack against OpenMax  by directly targeting the logits. Specifically, they assume a network using OpenMax instead of a SoftMax layer to compute the final class probabilities. OpenMax allows “open-set” networks by also allowing to reject input samples. By directly targeting the logits of the trained network, i.e. iteratively pushing the logits in a target direction, it does not matter whether SoftMax or OpenMax layers are used on top, the network can be fooled in both cases.
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).