Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers
Hadi Salman, Jerry Li, Ilya Razenshteyn, Pengchuan Zhang, Huan Zhang, Sébastien Bubeck, Greg Yang
Neural Information Processing Systems (NeurIPS), 2019
Salman et al. combine randomized smoothing with adversarial training, using an attack specifically designed against smoothed classifiers. They build on the randomized smoothing formulation of Cohen et al. [1]: Gaussian noise is sampled around the input (adversarial or clean) and the smoothed classifier takes a majority vote over the base classifier's predictions on these noisy samples. In [1], Cohen et al. show that this yields strong certified robustness bounds. In this paper, Salman et al. propose an adaptive attack against randomized smoothing: a PGD attack on the smoothed classifier that maximizes the cross-entropy loss of a soft version of the smoothed classifier (the expected softmax probabilities under Gaussian noise), since the hard majority vote is not differentiable. To make this objective tractable, the expectation is estimated with Monte Carlo samples in each PGD iteration. Based on this attack, they perform adversarial training, where adversarial examples are computed against the smoothed (and adversarially trained) classifier itself. In experiments, this approach outperforms the certified robustness obtained by Cohen et al. on several datasets.
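A minimal PyTorch sketch of such an attack is shown below, assuming a base classifier `model` that returns logits for an image batch in [0, 1]; the function name `smooth_adv_pgd`, the soft-probability estimator, and all hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def smooth_adv_pgd(model, x, y, sigma=0.25, eps=0.5, steps=10, n_samples=8):
    """PGD attack on a soft smoothed classifier (illustrative sketch).

    In each step, the smoothed classifier's class probabilities are estimated
    by averaging softmax outputs over Gaussian noise samples; the cross-entropy
    of this estimate is increased by gradient ascent, projected back onto an
    l2 ball of radius eps around the clean input x.
    """
    x_adv = x.clone().detach()
    step_size = 2.0 * eps / steps
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Monte Carlo estimate of the expected softmax under Gaussian noise.
        probs = sum(
            F.softmax(model(x_adv + sigma * torch.randn_like(x_adv)), dim=1)
            for _ in range(n_samples)
        ) / n_samples
        loss = F.nll_loss(torch.log(probs + 1e-12), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            # Normalized gradient ascent step, then projection onto the l2 ball.
            grad_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
            x_adv = x_adv + step_size * grad / grad_norm
            delta = x_adv - x
            delta_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
            x_adv = (x + delta * (eps / delta_norm).clamp(max=1.0)).clamp(0.0, 1.0)
        x_adv = x_adv.detach()
    return x_adv
```

For adversarial training, the base classifier would then be trained on Gaussian-noise-augmented versions of these adversarial examples, so that the smoothed classifier it induces becomes robust.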
[1] Jeremy M. Cohen, Elan Rosenfeld, and J. Zico Kolter. Certified Adversarial Robustness via Randomized Smoothing. arXiv:1902.02918, 2019.
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).