First published: 2017/06/08
Abstract: We consider the problem of detecting out-of-distribution images in neural
networks. We propose ODIN, a simple and effective method that does not require
any change to a pre-trained neural network. Our method is based on the
observation that using temperature scaling and adding small perturbations to
the input can separate the softmax score distributions between in- and
out-of-distribution images, allowing for more effective detection. We show in a
series of experiments that ODIN is compatible with diverse network
architectures and datasets. It consistently outperforms the baseline approach
by a large margin, establishing a new state-of-the-art performance on this
task. For example, ODIN reduces the false positive rate from the baseline 34.7%
to 4.3% on the DenseNet (applied to CIFAR-10) when the true positive rate is
95%.
## Task
Add a '**rejection**' output to an existing classification model with a softmax layer.
## Method
1. Choose some threshold $\delta$ and temperature $T$
2. Add a small perturbation to the input $x$ (eq 2):
let $\tilde x = x - \epsilon \text{sign}(-\nabla_x \log S_{\hat y}(x;T))$
3. If $p(\tilde x;T)\le \delta$, reject the input as out-of-distribution
4. Otherwise, return the prediction of the original classifier
$p(\tilde x;T)$ is the maximum softmax probability, computed with temperature scaling $T$, for the perturbed input $\tilde x$.
$\delta$, $T$, and $\epsilon$ are manually chosen hyperparameters.
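The steps above can be sketched in NumPy for a linear softmax classifier with hypothetical weights `W`, `b` (the paper applies the same procedure to any pre-trained network, where the input gradient would be obtained by backpropagation; the values of `T`, `eps`, and `delta` below are illustrative, not the paper's tuned settings):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def odin_score(x, W, b, T=1000.0, eps=0.0014):
    """Max temperature-scaled softmax probability of the perturbed input.

    For logits = W @ x + b, the gradient of log S_yhat(x; T) w.r.t. x
    is analytic: (W[yhat] - p @ W) / T, with p the scaled softmax.
    """
    logits = W @ x + b
    p = softmax(logits / T)
    yhat = int(np.argmax(p))
    grad = (W[yhat] - p @ W) / T           # nabla_x log S_yhat(x; T)
    x_tilde = x - eps * np.sign(-grad)     # step 2: input perturbation
    p_tilde = softmax((W @ x_tilde + b) / T)
    return float(p_tilde.max()), yhat

def odin_detect(x, W, b, delta, T=1000.0, eps=0.0014):
    """Steps 3-4: reject if the score falls at or below the threshold."""
    score, yhat = odin_score(x, W, b, T=T, eps=eps)
    if score <= delta:
        return "reject", None
    return "accept", yhat
```

The perturbation pushes the input toward higher confidence for the predicted class; in-distribution inputs respond more strongly than out-of-distribution ones, which (together with the temperature) widens the gap between the two score distributions.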