Summaries from Journal of Machine Learning Research on ShortScience.org

Dropout: a simple way to prevent neural networks from overfitting
Srivastava, Nitish and Hinton, Geoffrey E. and Krizhevsky, Alex and Sutskever, Ilya and Salakhutdinov, Ruslan
Journal of Machine Learning Research - 2014 via Local Bibsonomy

Summary by Martin Thoma 8 years ago

This paper, written by the same authors two years later, is a much better introduction to Dropout than [Improving neural networks by preventing co-adaptation of feature detectors](http://www.shortscience.org/paper?bibtexKey=journals/corr/1207.0580).

## General idea of Dropout

Dropout is a layer type with one parameter, the drop probability $\alpha \in (0, 1)$. The output dimensionality of a dropout layer is equal to its input dimensionality. During training, each neuron's output is independently set to 0 with probability $\alpha$. At test time no output is set to 0; to compensate, the output of every neuron is multiplied by $1 - \alpha$, the probability that a neuron was kept during training, so that the expected output matches what the next layer saw during training.
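A minimal NumPy sketch of such a layer may make the train/test asymmetry concrete. It assumes $\alpha$ is the drop probability, as in this summary; the names `dropout_forward`, `training` and `rng` are made up for illustration and do not come from the paper. Many libraries instead rescale by $1 / (1 - \alpha)$ during training and leave the test-time output untouched, which gives the same expected output.

```python
import numpy as np


def dropout_forward(x, alpha, training, rng=None):
    """Apply dropout with drop probability `alpha` to the array `x`.

    Illustrative sketch only: `alpha` is the probability of zeroing a unit.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in the open interval (0, 1)")
    if training:
        rng = np.random.default_rng() if rng is None else rng
        # Each unit is kept independently with probability 1 - alpha.
        mask = rng.random(x.shape) >= alpha
        return x * mask
    # Test time: nothing is dropped, so scale by the keep probability
    # to match the expected output seen during training.
    return x * (1.0 - alpha)
```

Calling `dropout_forward(x, 0.5, training=True)` returns an array of the same shape as `x` with roughly half of its entries zeroed, while `training=False` returns `0.5 * x`.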


## Interpretations

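Dropout can be interpreted as training an ensemble of many networks that share weights: each training step samples one "thinned" sub-network, and a network with $n$ droppable units has $2^n$ such sub-networks.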
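The ensemble view can be checked numerically in a small case. The sketch below enumerates every binary mask for a single linear unit and averages the masked outputs, weighted by each mask's probability; for a linear unit this average equals the unmasked output scaled by the keep probability. The function name `ensemble_mean_linear` and the example vectors are made up for illustration, and $\alpha$ is again assumed to be the drop probability. For networks with nonlinearities the test-time scaling is only an approximation to the ensemble average.

```python
import itertools

import numpy as np


def ensemble_mean_linear(w, x, alpha):
    """Probability-weighted mean of w @ (mask * x) over all 2^n binary masks."""
    n = len(x)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n):
        mask = np.array(bits, dtype=float)
        kept = int(mask.sum())
        # Probability of this mask: each unit is kept with 1 - alpha
        # and dropped with alpha, independently of the others.
        prob = (1.0 - alpha) ** kept * alpha ** (n - kept)
        total += prob * float(w @ (mask * x))
    return total


w = np.array([0.5, -1.0, 2.0, 0.25])
x = np.array([1.0, 2.0, 3.0, 4.0])
alpha = 0.3

averaged = ensemble_mean_linear(w, x, alpha)  # average over 16 sub-networks
scaled = (1.0 - alpha) * float(w @ x)         # single pass with scaling
```

Here `averaged` and `scaled` agree up to floating-point error, which is why one scaled forward pass can stand in for averaging over all sub-networks in the linear case.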

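It can also be seen as a regularizer: randomly dropping units adds noise to the hidden activations, which discourages units from co-adapting.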