Improving neural networks by preventing co-adaptation of feature detectors on ShortScience.org

Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton and Nitish Srivastava and Alex Krizhevsky and Ilya Sutskever and Ruslan R. Salakhutdinov
arXiv e-Print archive - 2012 via Local arXiv
Keywords: cs.NE, cs.CV, cs.LG

Summaries/Notes 1

Summary by Martin Thoma 7 years ago

This paper introduced dropout, a new layer type with a single parameter $\alpha \in (0, 1)$. The output dimensionality of a dropout layer equals its input dimensionality. During training, each neuron's output is set to 0 independently with probability $\alpha$. At test time no output is dropped; instead, every output is multiplied by $1 - \alpha$ (the probability of being kept), so that its expected value matches the one seen during training. For the paper's setting of $\alpha = 0.5$, the drop and keep probabilities coincide.
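The behaviour of such a layer can be sketched in a few lines of plain Python. This is only an illustration, not the authors' implementation: the function name `dropout_forward`, the `training` flag, and the `rng` argument are hypothetical. The sketch treats $\alpha$ as the drop probability and scales by the keep probability $1 - \alpha$ at test time so that expected activations match training; at the drop rate of 0.5 that the paper uses for hidden units, this factor equals $\alpha$.

```python
import random


def dropout_forward(x, alpha, training, rng=None):
    """Apply dropout to a list of activations x.

    alpha is the probability of dropping (zeroing) a unit.
    The names and signature here are illustrative assumptions.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if training:
        rng = rng or random.Random()
        # Each unit is zeroed independently with probability alpha.
        return [0.0 if rng.random() < alpha else v for v in x]
    # Test time: keep every unit and scale by the keep probability 1 - alpha.
    return [(1.0 - alpha) * v for v in x]


x = [1.0, 2.0, 3.0, 4.0]
train_out = dropout_forward(x, alpha=0.5, training=True, rng=random.Random(0))
test_out = dropout_forward(x, alpha=0.5, training=False)
```

The output list always has the same length as the input, which mirrors the statement that a dropout layer does not change dimensionality.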

A much better paper, published by the same authors two years later, is [Dropout: a simple way to prevent neural networks from overfitting](http://www.shortscience.org/paper?bibtexKey=journals/jmlr/SrivastavaHKSS14).

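Dropout can be interpreted as training an ensemble of many networks that share weights: each training step samples a different sub-network by dropping a random subset of neurons, and all of these sub-networks use the same underlying weights.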
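The ensemble view can be checked numerically on a toy case. The sketch below is an assumption-laden illustration, not code from the paper: it takes a single linear unit with made-up weights and inputs, enumerates all $2^n$ dropout masks, and averages the masked outputs weighted by each mask's probability. For a linear unit this average equals one pass over the full input scaled by the keep probability $1 - \alpha$; with nonlinearities the scaled pass is only an approximation to the ensemble average.

```python
from itertools import product


def linear_unit(weights, inputs):
    """Weighted sum of the inputs (a single linear neuron, no bias)."""
    return sum(w * v for w, v in zip(weights, inputs))


def ensemble_mean(weights, inputs, alpha):
    """Average output over all 2^n dropout masks, weighted by mask probability.

    alpha is the probability that an individual input is dropped.
    """
    total = 0.0
    for mask in product([0, 1], repeat=len(inputs)):
        prob = 1.0
        for m in mask:
            prob *= (1.0 - alpha) if m else alpha
        masked = [m * v for m, v in zip(mask, inputs)]
        total += prob * linear_unit(weights, masked)
    return total


# Hypothetical toy values, chosen only for illustration.
weights = [0.2, -0.5, 1.0]
inputs = [1.0, 2.0, 3.0]
alpha = 0.5

mean_over_subnetworks = ensemble_mean(weights, inputs, alpha)
scaled_single_pass = (1.0 - alpha) * linear_unit(weights, inputs)
```

Enumerating masks is only feasible for a handful of units, which is why a single scaled pass is used in place of explicit averaging.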

It was notably used by [ImageNet Classification with Deep Convolutional Neural Networks](http://www.shortscience.org/paper?bibtexKey=krizhevsky2012imagenet).
