Training Deep Neural Networks on Noisy Labels with Bootstrapping on ShortScience.org

Training Deep Neural Networks on Noisy Labels with Bootstrapping
Reed, Scott E. and Lee, Honglak and Anguelov, Dragomir and Szegedy, Christian and Erhan, Dumitru and Rabinovich, Andrew
arXiv e-Print archive - 2014 via Local Bibsonomy
Keywords: dblp

Summary by Léo Paillier 7 years ago

_Objective:_ Design a loss that makes deep networks robust to label noise.

_Datasets:_ [MNIST](http://yann.lecun.com/exdb/mnist/), Toronto Faces Database, [ILSVRC2014](http://www.image-net.org/challenges/LSVRC/2014/).


#### Inner-workings:

Three types of losses are presented:

*   reconstruction loss:

[![screen shot 2017-06-26 at 11 00 07 am](https://user-images.githubusercontent.com/17261080/27532200-bb42b8a6-5a5f-11e7-8c14-673958216bfc.png)](https://user-images.githubusercontent.com/17261080/27532200-bb42b8a6-5a5f-11e7-8c14-673958216bfc.png)

*   soft bootstrapping, which trains against a convex combination of the user-provided (possibly noisy) labels `tk` and the network's own predicted probabilities `qk`:

[![screen shot 2017-06-26 at 11 10 43 am](https://user-images.githubusercontent.com/17261080/27532296-1e01a420-5a60-11e7-9273-d1affb0d7c2e.png)](https://user-images.githubusercontent.com/17261080/27532296-1e01a420-5a60-11e7-9273-d1affb0d7c2e.png)

*   hard bootstrapping, which replaces the soft predicted labels with their binary version (a one-hot vector at the most probable class):

[![screen shot 2017-06-26 at 11 12 58 am](https://user-images.githubusercontent.com/17261080/27532439-a3f9dbd8-5a60-11e7-91a7-327efc748eae.png)](https://user-images.githubusercontent.com/17261080/27532439-a3f9dbd8-5a60-11e7-91a7-327efc748eae.png)

[![screen shot 2017-06-26 at 11 13 05 am](https://user-images.githubusercontent.com/17261080/27532463-b52f4ab4-5a60-11e7-9aed-615109b61bd8.png)](https://user-images.githubusercontent.com/17261080/27532463-b52f4ab4-5a60-11e7-9aed-615109b61bd8.png)
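The two bootstrapping losses can be sketched in a few lines of NumPy. This is only an illustration of the idea described above, not the authors' code: the function names, the weight `beta`, and the choice to write the loss as a negated weighted log-likelihood (so that lower is better) are assumptions made for this sketch.

```python
import numpy as np

def softmax(logits):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def bootstrap_loss(logits, t, beta, hard=False, eps=1e-12):
    """Mean bootstrapping loss over a batch (hypothetical helper).

    logits: (batch, classes) raw network outputs
    t:      (batch, classes) one-hot user-provided labels, possibly noisy
    beta:   weight on the given labels; 1 - beta goes to the network's guess
    hard:   if True, use the one-hot argmax of q instead of q itself
    """
    q = softmax(logits)
    if hard:
        # Binary version of the prediction: 1 at the most probable class.
        guess = np.eye(q.shape[1])[q.argmax(axis=1)]
    else:
        guess = q
    # Training target: convex combination of given labels and the guess.
    target = beta * t + (1.0 - beta) * guess
    return float(-(target * np.log(q + eps)).sum(axis=1).mean())
```

With `beta=1.0` both variants reduce to the usual cross-entropy on the given labels. Lowering `beta` lets the network's own predictions increasingly override labels it disagrees with.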

#### Architecture:


They test with feed-forward neural networks only.

#### Results:

They use only permutation noise, with a much higher noise probability than we might encounter in real life.
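To make the noise model concrete, here is a minimal sketch of how such label corruption could be simulated. It assumes that "permutation noise" means each label is, with probability `noise_prob`, replaced by its image under one fixed random permutation of the classes. The function name and parameters are hypothetical, and the paper's exact procedure may differ.

```python
import numpy as np

def permute_labels(labels, num_classes, noise_prob, seed=0):
    """Corrupt integer class labels with permutation noise (illustrative).

    Returns the noisy labels and the permutation that was used.
    """
    rng = np.random.default_rng(seed)
    # One fixed mapping class -> class, shared by all corrupted examples.
    # A random permutation can have fixed points, so some "corrupted"
    # labels may coincide with the original ones.
    perm = rng.permutation(num_classes)
    # Choose independently for each example whether it gets corrupted.
    flip = rng.random(labels.shape[0]) < noise_prob
    noisy = np.where(flip, perm[labels], labels)
    return noisy, perm
```

Because the corruption is a consistent class-to-class mapping rather than uniform random relabeling, the noise is structured, which is the kind of pattern a network's own predictions can help to correct.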

[![screen shot 2017-06-26 at 11 29 05 am](https://user-images.githubusercontent.com/17261080/27533105-b051d366-5a62-11e7-95f3-168d0d2d7841.png)](https://user-images.githubusercontent.com/17261080/27533105-b051d366-5a62-11e7-95f3-168d0d2d7841.png)

For small noise probabilities (<10%), the improvement might not be that interesting.
