Fortified Networks: Improving the Robustness of Deep Networks by Modeling the Manifold of Hidden Representations
Lamb, Alex and Binas, Jonathan and Goyal, Anirudh and Serdyuk, Dmitriy and Subramanian, Sandeep and Mitliagkas, Ioannis and Bengio, Yoshua
arXiv e-Print archive - 2018 via Local Bibsonomy
Keywords: dblp
Lamb et al. introduce fortified networks, which insert denoising autoencoders at hidden layers. These denoising autoencoders are meant to learn the manifold of hidden representations, project adversarial inputs back onto this manifold, and thereby improve robustness. The main idea is illustrated in Figure 1. The denoising autoencoders can be added at any layer and are trained jointly with the classification network, either on the original input or on adversarial examples as in adversarial training; a minimal sketch follows below the figure.
https://i.imgur.com/5vaZrVk.png
Figure 1: Illustration of a fortified layer, i.e., a hidden layer that is reconstructed through a denoising autoencoder as a defense mechanism. The denoising autoencoders are trained jointly with the network.
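Conceptually, a fortified layer is a small denoising autoencoder wrapped around a hidden activation, with its reconstruction error added to the training objective. Below is a minimal PyTorch sketch, assuming a simple MLP classifier and Gaussian corruption of the hidden state; the names `FortifiedLayer`, `FortifiedMLP` and the loss weight `lambda_rec` are illustrative and not taken from the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FortifiedLayer(nn.Module):
    """Denoising autoencoder inserted at a hidden layer (sketch).

    Corrupts the hidden activation with Gaussian noise, reconstructs it,
    and returns both the denoised activation (passed forward) and the
    reconstruction loss (added to the training objective).
    """
    def __init__(self, dim, hidden_dim=64, noise_std=0.1):
        super().__init__()
        self.encoder = nn.Linear(dim, hidden_dim)
        self.decoder = nn.Linear(hidden_dim, dim)
        self.noise_std = noise_std

    def forward(self, h):
        # Corrupt only during training; at test time the DAE acts as a
        # projection of (possibly adversarial) activations onto the manifold.
        noise = torch.randn_like(h) * self.noise_std if self.training else 0.0
        h_denoised = self.decoder(F.relu(self.encoder(h + noise)))
        # Detaching the target is a design choice of this sketch.
        rec_loss = F.mse_loss(h_denoised, h.detach())
        return h_denoised, rec_loss

class FortifiedMLP(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.dae = FortifiedLayer(hidden)  # fortified hidden layer
        self.fc2 = nn.Linear(hidden, n_classes)

    def forward(self, x):
        h = F.relu(self.fc1(x))
        h, rec_loss = self.dae(h)
        return self.fc2(h), rec_loss

model = FortifiedMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_rec = 0.01  # weight of the reconstruction term (assumed value)

def train_step(x, y):
    # Joint objective: classification loss plus weighted DAE
    # reconstruction loss; x may also be an adversarial example,
    # as in adversarial training.
    logits, rec_loss = model(x)
    loss = F.cross_entropy(logits, y) + lambda_rec * rec_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Passing the denoised activation forward and detaching the reconstruction target are choices made for this sketch; the paper also considers reconstruction losses computed on adversarial hidden states.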
In experiments, they show that the proposed defense mechanism improves robustness on MNIST and CIFAR compared to adversarial training and other baselines. The improvements are, however, very marginal, especially since the proposed method imposes additional computational overhead on top of adversarial training.
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).