Thwarting Adversarial Examples: An L_0-Robust Sparse Fourier Transform on ShortScience.org

papers.nips.cc
scholar.google.com

Thwarting Adversarial Examples: An L_0-Robust Sparse Fourier Transform
Bafna, Mitali and Murtagh, Jack and Vyas, Nikhil
Neural Information Processing Systems Conference - 2018 via Local Bibsonomy
Keywords: dblp

Summaries/Notes 1

[link] Summary by David Stutz 4 years ago

Bafna et al. show that iterative hard thresholding results in $L_0$ robust Fourier transforms. In particular, as shown in Algorithm 1, iterative hard thresholding assumes a signal $y = x + e$ where $x$ is assumed to be sparse, and $e$ is assumed to be sparse. This translates to noise $e$ that is bounded in its $L_0$ norm, corresponding to common adversarial attacks such as adversarial patches in computer vision. Using their algorithm, the authors can provably reconstruct the signal, specifically the top-$k$ coordinates for a $k$-sparse signal, which can subsequently be fed to a neural network classifier. In experiments, the classifier is always trained on sparse signals, and at test time, the sparse signal is reconstructed prior to the forward pass. This way, on MNIST and Fashion-MNIST, the algorithm is able to recover large parts of the original accuracy.

https://i.imgur.com/yClXLoo.jpg
Algorithm 1 (see paper for details): The iterative hard thresholding algorithm resulting in provable robustness against $L_0$ attack on images and other signals.

Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).

Your comment:

Write your summary here (You can use $\LaTeX$ and markdown syntax):

Anon Private