Efficient Evaluation-Time Uncertainty Estimation by Improved Distillation
Englesson, Erik
and
Azizpour, Hossein
arXiv e-Print archive - 2019 via Local Bibsonomy
Keywords:
dblp
Englesson and Azizpour propose an adapted version of knowledge distillation to improve confidence calibration on out-of-distribution examples, including adversarial examples. In contrast to vanilla distillation, they make the following changes: First, high-capacity student networks are used, for example by increasing depth or width. Then, the target distribution is “sharpened” using the true label, reducing the distribution's overall entropy. Finally, for wrong predictions of the teacher model, they propose an alternative target distribution that places maximum mass on the correct class, while not losing the information the teacher provides about the incorrect classes.
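The summary only sketches the target construction at a high level; the snippet below is a rough illustration of what such a transformation could look like. The interpolation-based sharpening, the `sharpen_weight` parameter, and the swap-based correction of wrong teacher predictions are my assumptions, not necessarily the paper's exact formulation.

```python
import numpy as np

def build_distillation_targets(teacher_probs, true_labels, sharpen_weight=0.5):
    """Hypothetical sketch of the target construction described above.

    teacher_probs:  (N, C) softmax outputs of the teacher
    true_labels:    (N,)   ground-truth class indices
    sharpen_weight: assumed interpolation weight toward the one-hot true label
    """
    n, c = teacher_probs.shape
    one_hot = np.eye(c)[true_labels]

    # "Sharpen" each teacher distribution using the true label: interpolating
    # toward the one-hot target lowers the overall entropy while keeping the
    # teacher's relative probabilities over the remaining classes.
    targets = sharpen_weight * one_hot + (1.0 - sharpen_weight) * teacher_probs

    # For samples the teacher gets wrong, swap the mass of the (wrong) predicted
    # class and the true class, so the correct class receives the maximum
    # probability while the rest of the distribution is preserved.
    preds = teacher_probs.argmax(axis=1)
    idx = np.where(preds != true_labels)[0]
    pred_mass = targets[idx, preds[idx]]
    true_mass = targets[idx, true_labels[idx]]
    targets[idx, preds[idx]] = true_mass
    targets[idx, true_labels[idx]] = pred_mass
    return targets
```

The resulting targets could then be used in place of the usual temperature-scaled teacher outputs when training the (higher-capacity) student with a standard distillation loss.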
Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).