Gehr et al. propose a method based on abstract interpretations in order to verify robustness guarantees of neural networks. First of all, I want to note that (in contrast to most work in adversarial robustness) the proposed method is not intended to improve robustness, but to get robustness certificates. Without going into details, abstract interpretations allow to verify conditions (e.g., robustness) of a function (e.g., a neural network) based on abstractions of the input. In particular, by abstracting a norm-ball around a test sample (as is typically considered in adversarial robustness) using box constraints or polyhedra, leading to an over-approximation of the norm-ball, and transforming these abstractions according to the layers of a network, the network’s output can be checked against robustness conditions without running the network on all individual points in the norm-ball. As a result, if the proposed method certifies robustness for a given input sample and an area around it, the network s indeed robust in this area (soundness). If not, the network might indeed not be robust, or robustness could not be certified due to the method’s over-approximation. For details, I refer to the paper, as well as follow-up work  and .
 Matthew Mirman, Timon Gehr, Martin T. Vechev: Differentiable Abstract Interpretation for Provably Robust Neural Networks. ICML 2018: 3575-3583
 Gagandeep Singh, Timon Gehr, Matthew Mirman, Markus Püschel, Martin T. Vechev: Fast and Effective Robustness Certification. NeurIPS 2018: 10825-10836
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).