On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models
Sven Gowal, Krishnamurthy Dvijotham, Robert Stanforth, Rudy Bunel, Chongli Qin, Jonathan Uesato, Relja Arandjelovic, Timothy Mann, Pushmeet Kohli
arXiv e-Print archive, 2018
Keywords: cs.LG, cs.CR, stat.ML
First published: 2018/10/30

Abstract: Recent work has shown that it is possible to train deep neural networks that
are verifiably robust to norm-bounded adversarial perturbations. Most of these
methods are based on minimizing an upper bound on the worst-case loss over all
possible adversarial perturbations. While these techniques show promise, they
remain hard to scale to larger networks. Through a comprehensive analysis, we
show how a careful implementation of a simple bounding technique, interval
bound propagation (IBP), can be exploited to train verifiably robust neural
networks that beat the state-of-the-art in verified accuracy. While the upper
bound computed by IBP can be quite weak for general networks, we demonstrate
that an appropriate loss and choice of hyper-parameters allows the network to
adapt such that the IBP bound is tight. This results in a fast and stable
learning algorithm that outperforms more sophisticated methods and achieves
state-of-the-art results on MNIST, CIFAR-10 and SVHN. It also allows us to
obtain the first verifiably robust model on a downscaled version of ImageNet.
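To make the bounding technique concrete, here is a minimal NumPy sketch of IBP for a small fully connected ReLU network: intervals are propagated layer by layer, and the resulting logit bounds yield a worst-case logit vector that certifies robustness. The function names, the toy network, and the verification check are illustrative assumptions for this summary, not the paper's code.

```python
import numpy as np

def interval_affine(lower, upper, W, b):
    # An affine map sends the box [lower, upper] to a box around the
    # mapped center; the new radius is |W| applied to the old radius.
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius
    return new_center - new_radius, new_center + new_radius

def interval_relu(lower, upper):
    # ReLU is monotone, so the bounds pass through it elementwise.
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

def ibp_bounds(x, epsilon, layers):
    # Bound the logits over the l_inf ball of radius epsilon around x.
    lower, upper = x - epsilon, x + epsilon
    for i, (W, b) in enumerate(layers):
        lower, upper = interval_affine(lower, upper, W, b)
        if i < len(layers) - 1:  # no ReLU after the final (logit) layer
            lower, upper = interval_relu(lower, upper)
    return lower, upper

def worst_case_logits(lower, upper, label):
    # Most adversarial logit vector consistent with the bounds: every
    # wrong class at its upper bound, the true class at its lower bound.
    z_hat = upper.copy()
    z_hat[label] = lower[label]
    return z_hat

def verified_robust(x, label, epsilon, layers):
    # The prediction is verifiably robust if even the worst-case logits
    # still rank the true class first.
    lower, upper = ibp_bounds(x, epsilon, layers)
    return int(np.argmax(worst_case_logits(lower, upper, label))) == label

# Toy check with a random 4-8-3 network (weights are placeholders).
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(8, 4)), rng.normal(size=8)),
          (rng.normal(size=(3, 8)), rng.normal(size=3))]
x = rng.normal(size=4)
print(verified_robust(x, label=1, epsilon=0.01, layers=layers))
```

For training, the paper combines these bounds with a loss that mixes the nominal cross-entropy and the cross-entropy evaluated on the worst-case logits, gradually ramping up the perturbation radius and annealing the mixing weight; this schedule is what lets the network adapt so that the IBP bound becomes tight.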