First published: 2018/02/09
Abstract: Adversarial examples that fool machine learning models, particularly deep
neural networks, have been a topic of intense research interest, with attacks
and defenses being developed in a tight back-and-forth. Most past defenses are
best-effort and have been shown to be vulnerable to sophisticated attacks.
Recently, certified defenses have been introduced, which provide
guarantees of robustness to norm-bounded attacks, but they either do not scale
to large datasets or are limited in the types of models they can support. This
paper presents the first certified defense that both scales to large networks
and datasets (such as Google's Inception network for ImageNet) and applies
broadly to arbitrary model types. Our defense, called PixelDP, is based on a
novel connection between robustness against adversarial examples and
differential privacy, a cryptographically inspired formalism that provides a
rigorous, generic, and flexible foundation for defense.
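
To make the connection between differential privacy and robustness concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes noise calibrated with the standard Gaussian mechanism from the differential privacy literature, and a certification check of the form usually associated with PixelDP. Neither formula appears in this abstract, so both should be verified against the full paper. The names `gaussian_sigma`, `is_certified`, `sensitivity`, and `attack_bound` are hypothetical and chosen only for illustration.

```python
import math


def gaussian_sigma(sensitivity, attack_bound, eps, delta):
    """Noise scale for an (eps, delta)-DP Gaussian mechanism.

    Assumption: the standard calibration
    sigma = sqrt(2 ln(1.25 / delta)) * sensitivity * attack_bound / eps,
    which is only valid for eps <= 1.
    """
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity * attack_bound / eps


def is_certified(expected_scores, eps, delta):
    """Check an assumed robustness condition on expected (noise-averaged) scores.

    Assumption: a prediction is treated as certified when the top expected
    score exceeds exp(2*eps) * runner_up + (1 + exp(2*eps)) * delta.
    In practice the expectations are estimated by sampling, so confidence
    bounds would be needed; this sketch takes the expectations as given.
    """
    ranked = sorted(expected_scores, reverse=True)
    top, runner_up = ranked[0], ranked[1]
    growth = math.exp(2.0 * eps)
    return top > growth * runner_up + (1.0 + growth) * delta


if __name__ == "__main__":
    # Hypothetical numbers, purely for illustration.
    print(gaussian_sigma(sensitivity=1.0, attack_bound=0.1, eps=1.0, delta=0.05))
    print(is_certified([0.90, 0.05, 0.05], eps=1.0, delta=0.05))
    print(is_certified([0.60, 0.30, 0.10], eps=1.0, delta=0.05))
```

The sketch separates the two roles that differential privacy plays in such a defense: the noise scale bounds how much a norm-bounded input change can shift the expected output, and the check asks whether the margin between the top two expected scores is larger than that bound allows an attacker to close.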