Learning with a Strong Adversary
Ruitong Huang, Bing Xu, Dale Schuurmans, Csaba Szepesvari
arXiv e-Print archive - 2015 via Local arXiv
Keywords: cs.LG
First published: 2015/11/10
Abstract: The robustness of neural networks to intended perturbations has recently
attracted significant attention. In this paper, we propose a new method,
\emph{learning with a strong adversary}, that learns robust classifiers from
supervised data. The proposed method takes finding adversarial examples as an
intermediate step. A new and simple way of finding adversarial examples is
presented and experimentally shown to be efficient. Experimental results
demonstrate that resulting learning method greatly improves the robustness of
the classification models produced.
Huang et al. propose a variant of adversarial training called “learning with a strong adversary”, which is similar in spirit to virtual adversarial training [1]. In particular, the authors consider the min-max objective
$\min_g \sum_i \max_{\|r^{(i)}\|\leq c} l(g(x_i + r^{(i)}), y_i)$
where $g$ ranges over expressible functions, $(x_i, y_i)$ is a training sample, $l$ is the loss, and $r^{(i)}$ is a perturbation of $x_i$ whose norm is bounded by $c$. In the remainder of the paper, Huang et al. address the problem of efficiently computing $r^{(i)}$, that is, a strong adversarial example based on the current state of the network, and subsequently updating the weights of the network by computing the gradient of the augmented loss. Details can be found in the paper.
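To make the alternation between the inner maximization and the outer minimization concrete, here is a minimal NumPy sketch on a logistic-regression model. It is an illustration under stated assumptions, not the authors' implementation: the model, the $L_2$ norm, the function names (`loss_and_grads`, `l2_adversary`, `train`) and all hyperparameters are hypothetical. The inner maximum is approximated by a step of length $c$ along the gradient of the loss with respect to the input, which is a common first-order approximation and happens to be exact for a linear model like this one.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grads(w, b, X, y):
    """Mean logistic loss with gradients w.r.t. w, b, and each input x_i."""
    p = sigmoid(X @ w + b)
    eps = 1e-12  # avoids log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    err = p - y                          # derivative of loss_i w.r.t. the logit
    grad_w = X.T @ err / len(y)
    grad_b = err.mean()
    grad_X = err[:, None] * w[None, :]   # per-example gradient w.r.t. x_i
    return loss, grad_w, grad_b, grad_X

def l2_adversary(w, b, X, y, c):
    """Assumed inner step: r_i = c * grad_i / ||grad_i||_2, so ||r_i||_2 <= c."""
    _, _, _, grad_X = loss_and_grads(w, b, X, y)
    norms = np.linalg.norm(grad_X, axis=1, keepdims=True)
    return c * grad_X / np.maximum(norms, 1e-12)

def train(X, y, c=0.5, lr=0.1, steps=300):
    """Alternate: perturb inputs with the current weights, then descend."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        R = l2_adversary(w, b, X, y, c)                        # inner max
        _, grad_w, grad_b, _ = loss_and_grads(w, b, X + R, y)  # outer min
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

The perturbations are recomputed at every step from the current weights, which is what "based on the current state of the network" amounts to in this sketch. For a deep network the same loop would apply, with the input gradient obtained by backpropagation and the linearized step serving only as an approximation of the inner maximum.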
[1] T. Miyato, S. Maeda, M. Koyama, K. Nakae, S. Ishii. Distributional Smoothing by Virtual Adversarial Training. arXiv:1507.00677, 2015.
Also see this summary at [davidstutz.de](https://davidstutz.de/category/reading/).