Efficient Defenses Against Adversarial Attacks
Valentina Zantedeschi, Maria-Irina Nicolae, and Ambrish Rawat
arXiv e-Print archive, 2017
Keywords: cs.LG
First published: 2017/07/21
Abstract: Following the recent adoption of deep neural networks (DNNs) across a wide
range of applications, adversarial attacks against these models have proven to
be an indisputable threat. Adversarial samples are crafted with the deliberate
intention of undermining a system. In the case of DNNs, a limited
understanding of their inner workings has prevented the development of efficient
defenses. In this paper, we propose a new defense method based on practical
observations that is easy to integrate into models and performs better than
state-of-the-art defenses. Our proposed solution is meant to reinforce the
structure of a DNN, making its prediction more stable and less likely to be
fooled by adversarial samples. We conduct an extensive experimental study
demonstrating the efficiency of our method against multiple attacks, comparing it to
numerous defenses, both in white-box and black-box setups. Additionally, the
implementation of our method brings almost no overhead to the training
procedure, while maintaining the prediction performance of the original model
on clean samples.