Certified Adversarial Robustness via Randomized Smoothing
Jeremy M. Cohen, Elan Rosenfeld, and J. Zico Kolter
arXiv e-Print archive, 2019
Keywords: cs.LG, stat.ML
First published: 2019/02/08
Abstract: We show how to turn any classifier that classifies well under Gaussian noise
into a new classifier that is certifiably robust to adversarial perturbations
under the $\ell_2$ norm. This "randomized smoothing" technique has been
proposed recently in the literature, but existing guarantees are loose. We
prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian
noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a
certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$
norm less than 0.5 (=127/255). No certified defense has been shown feasible on
ImageNet except for smoothing. On smaller-scale datasets where competing
approaches to certified $\ell_2$ robustness are viable, smoothing delivers
higher certified accuracies. Our strong empirical results suggest that
randomized smoothing is a promising direction for future research into
adversarially robust classification. Code and models are available at
http://github.com/locuslab/smoothing.