Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods
Nicholas Carlini and David Wagner
arXiv e-Print archive, 2017
Keywords: cs.LG, cs.CR, cs.CV
First published: 2017/05/20
Abstract: Neural networks are known to be vulnerable to adversarial examples: inputs
that are close to natural inputs but classified incorrectly. In order to better
understand the space of adversarial examples, we survey ten recent proposals
that are designed for detection and compare their efficacy. We show that all
can be defeated by constructing new loss functions. We conclude that
adversarial examples are significantly harder to detect than previously
appreciated, and the properties believed to be intrinsic to adversarial
examples are in fact not. Finally, we propose several simple guidelines for
evaluating future proposed defenses.
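
The core technical move, defeating a detector by folding it into the attack's loss function, can be illustrated with a minimal sketch. The snippet below is not the paper's code: `classifier` and `detector` are hypothetical callables (class logits and a scalar "flagged as adversarial" logit, respectively), and the margin term is a simplified stand-in for the Carlini-Wagner objective the paper builds on.

```python
import torch
import torch.nn.functional as F

def evade_with_detector_loss(x, target, classifier, detector,
                             steps=100, lr=0.01, c=1.0):
    """Sketch: perturb x so the classifier predicts `target` while the
    detector's adversarial score stays low. Interfaces are assumed:
    classifier(x) -> (B, num_classes) logits, detector(x) -> (B,) logit
    where positive means "flagged as adversarial"."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        x_adv = (x + delta).clamp(0, 1)
        logits = classifier(x_adv)
        det = detector(x_adv)

        # Push the target class above the runner-up (simplified margin loss,
        # not the paper's exact formulation).
        top2 = logits.topk(2, dim=1).values
        best_other = torch.where(logits.argmax(dim=1) == target,
                                 top2[:, 1], top2[:, 0])
        cls_loss = F.relu(best_other - logits.gather(1, target[:, None]).squeeze(1))

        # Also penalize the detector firing, so the example evades both models.
        det_loss = F.relu(det)

        # Keep the perturbation small (L2 distance to the original input).
        dist = delta.flatten(1).norm(dim=1)

        loss = (dist + c * (cls_loss + det_loss)).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()

    return (x + delta).detach().clamp(0, 1)
```

The design point the abstract makes is that the detector is just another differentiable component: adding its score to the attack objective is enough to search for inputs that are simultaneously misclassified and judged "clean".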