Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization
Uri Shaham, Yutaro Yamada, and Sahand Negahban
arXiv e-Print archive - 2015 via Local arXiv
Keywords: stat.ML, cs.LG, cs.NE
First published: 2015/11/17
Abstract: We propose a general framework for increasing local stability of Artificial
Neural Nets (ANNs) using Robust Optimization (RO). We achieve this through an
alternating minimization-maximization procedure, in which the loss of the
network is minimized over perturbed examples that are generated at each
parameter update. We show that adversarial training of ANNs is in fact
robustification of the network optimization, and that our proposed framework
generalizes previous approaches for increasing local stability of ANNs.
Experimental results reveal that our approach increases the robustness of the
network to existing adversarial examples, while making it harder to generate
new ones. Furthermore, our algorithm also improves the accuracy of the network
on the original test data.
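
The alternating minimization-maximization procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or experimental setup: it assumes a toy logistic regression in place of a neural net, an L-infinity perturbation budget `eps`, and a sign-of-gradient step for the inner maximization (which happens to be the exact worst case for a linear model). All function names, the toy data, and the hyperparameters are hypothetical.

```python
import math
import random


def sign(v):
    # Returns -1, 0, or +1.
    return (v > 0) - (v < 0)


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def margin(w, b, x, y):
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def logistic_loss(w, b, x, y):
    # log(1 + exp(-m)) computed stably, for labels y in {-1, +1}.
    m = margin(w, b, x, y)
    return max(-m, 0.0) + math.log1p(math.exp(-abs(m)))


def loss_grads(w, b, x, y):
    # Gradients of the logistic loss w.r.t. weights, bias, and input.
    coeff = -y * sigmoid(-margin(w, b, x, y))
    return [coeff * xi for xi in x], coeff, [coeff * wi for wi in w]


def worst_case(w, b, x, y, eps):
    # Maximization step (assumed L-infinity ball): shift every coordinate
    # by eps in the direction that increases the loss.
    _, _, grad_x = loss_grads(w, b, x, y)
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


def train(data, eps, epochs=50, lr=0.1, seed=0):
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0
    order = list(data)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            # Perturbed example is regenerated at each parameter update.
            x_adv = worst_case(w, b, x, y, eps)
            # Minimization step: gradient descent on the perturbed example.
            grad_w, grad_b, _ = loss_grads(w, b, x_adv, y)
            w = [wi - lr * g for wi, g in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


def robust_accuracy(w, b, data, eps):
    # Accuracy when every point is replaced by its worst-case perturbation.
    hits = sum(margin(w, b, worst_case(w, b, x, y, eps), y) > 0 for x, y in data)
    return hits / len(data)


def make_toy_data(n, seed=0):
    # Two Gaussian clusters centred at (1, 1) and (-1, -1); hypothetical data.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        data.append(([y + rng.gauss(0.0, 0.3), y + rng.gauss(0.0, 0.3)], y))
    return data


data = make_toy_data(200)
w, b = train(data, eps=0.3)
print("clean accuracy:", robust_accuracy(w, b, data, 0.0))
print("accuracy under eps=0.3 perturbations:", robust_accuracy(w, b, data, 0.3))
```

Setting `eps=0.0` in `train` reduces the loop to ordinary stochastic gradient descent, so the same function can serve as a non-robust baseline for comparison.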