This paper presents the theoretical notion of ensemble robustness and argues that it might explain the success of deep learning algorithms. It extends some of the authors' previous work (see Definition 2), which demonstrated a theoretical relationship between a notion of robustness to adversarial examples and generalization performance. An initial observation made in this work is that this earlier notion of robustness cannot explain the good performance of deep neural networks, since they have in fact been shown not to be robust to adversarial examples.
The authors therefore propose to study a notion of ensemble robustness (see Definition 3) and show that it can also be linked to generalization performance (see Theorem 1 and Corollary 1). The "ensemble" part comes from taking into account the stochasticity of the learning algorithm, i.e. the fact that the models it produces can vary from one run to another, even when it is applied to the same training set. This stochasticity can come from the use of dropout, from SGD with a random ordering of the training examples, or from random parameter initialization. Other theoretical results are also presented, such as one relating the variance of the robustness to generalization performance and another specific to the use of dropout.
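To make the "ensemble" idea concrete, here is a rough sketch of how one might estimate such a quantity empirically. It is not the paper's Definition 3: it assumes a toy logistic regression trained by SGD, uses random sampling inside an L-infinity ball of radius `eps` as a stand-in for a true worst case, and all function names (`train_sgd`, `robustness_gap`, `ensemble_gap`) are hypothetical. The only stochasticity modelled is random initialization and random example ordering, both controlled by the seed.

```python
import numpy as np

def train_sgd(X, y, seed, epochs=20, lr=0.1):
    """Logistic regression trained by SGD; the seed sets the init and the example order."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (p - y[i]) * X[i]
    return w

def loss(w, X, y):
    """Per-example logistic loss for labels in {0, 1}."""
    z = X @ w
    return np.logaddexp(0.0, z) - y * z

def robustness_gap(w, X, y, eps, n_samples, rng):
    """Largest sampled loss change over training points under perturbations of size <= eps.

    Assumption: random sampling only lower-bounds the true worst case.
    """
    base = loss(w, X, y)
    worst = 0.0
    for _ in range(n_samples):
        delta = rng.uniform(-eps, eps, size=X.shape)
        worst = max(worst, float(np.max(np.abs(loss(w, X + delta, y) - base))))
    return worst

def ensemble_gap(X, y, seeds, eps=0.1, n_samples=50):
    """Mean and variance of the robustness gap over independent training runs."""
    gaps = []
    for s in seeds:
        w = train_sgd(X, y, s)
        gaps.append(robustness_gap(w, X, y, eps, n_samples, np.random.default_rng(1000 + s)))
    return float(np.mean(gaps)), float(np.var(gaps)), gaps

# Toy data: two Gaussian blobs (purely illustrative).
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(-1, 1, (50, 2)), data_rng.normal(1, 1, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
mean_gap, var_gap, gaps = ensemble_gap(X, y, seeds=range(5))
```

The per-run values in `gaps` correspond to the robustness of individual models, while `mean_gap` averages over the randomness of training, which is the spirit of the ensemble view; `var_gap` is the kind of quantity the variance result would be concerned with.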
Finally, the paper also proposes a semi-supervised learning algorithm inspired by the definition of ensemble robustness, in which a model is trained to classify the perturbed (adversarial) version of an example into the same class as the original (unperturbed) example. On MNIST, the authors achieve excellent results, matching the performance of the state-of-the-art Ladder Networks.
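As I understand the training signal, it can be sketched roughly as follows. This is only an illustration under my own assumptions, not the authors' algorithm: the model is a binary logistic regression, the perturbation is a single signed-gradient step of size `eps`, the class of an unlabeled example is taken to be the model's own prediction on the clean input, and the names `adversarial_inputs` and `consistency_loss` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_inputs(w, X, targets, eps):
    """One signed-gradient step on the inputs that increases the logistic loss w.r.t. `targets`."""
    grad_x = (sigmoid(X @ w) - targets)[:, None] * w[None, :]  # d loss / d x for each example
    return X + eps * np.sign(grad_x)

def consistency_loss(w, X_unlabeled, eps):
    """Loss for assigning each perturbed point to the class predicted for its clean version."""
    pseudo = (X_unlabeled @ w > 0).astype(float)  # predicted class of the clean input, held fixed
    X_adv = adversarial_inputs(w, X_unlabeled, pseudo, eps)
    z = X_adv @ w
    return float(np.mean(np.logaddexp(0.0, z) - pseudo * z))

# Illustrative usage with arbitrary weights and random unlabeled points.
w = np.array([1.0, -0.5])
X_u = np.random.default_rng(0).normal(size=(20, 2))
clean = consistency_loss(w, X_u, eps=0.0)
perturbed = consistency_loss(w, X_u, eps=0.1)
```

In a semi-supervised setting, a term like `consistency_loss` would presumably be added to the usual supervised loss on the labeled examples, since it requires no ground-truth labels.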