Dropout: a simple way to prevent neural networks from overfitting
Srivastava, Nitish; Hinton, Geoffrey E.; Krizhevsky, Alex; Sutskever, Ilya; Salakhutdinov, Ruslan
Journal of Machine Learning Research, 2014
This paper is a much better introduction to Dropout than [Improving neural networks by preventing co-adaptation of feature detectors](http://www.shortscience.org/paper?bibtexKey=journals/corr/1207.0580), which the same authors wrote two years earlier.
## General idea of Dropout
Dropout is a layer type with a single parameter $\alpha \in (0, 1)$. The output dimensionality of a dropout layer equals its input dimensionality. During training, each neuron's output is set to 0 with probability $\alpha$, independently of the others. At test time nothing is dropped; instead, every neuron's output is multiplied by $(1 - \alpha)$ to compensate for the fact that no outputs are set to 0.
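As a rough illustration (my own sketch, not the authors' code), a standard non-inverted dropout forward pass with $\alpha$ as the drop probability could look like this in NumPy:

```python
import numpy as np

def dropout_forward(x, alpha, train=True):
    """Standard (non-inverted) dropout, as described above.

    alpha: probability of dropping a unit (setting its output to 0).
    """
    if train:
        # Sample a binary mask: each unit is kept with probability (1 - alpha).
        mask = (np.random.rand(*x.shape) >= alpha).astype(x.dtype)
        return x * mask
    else:
        # At test time nothing is dropped; scale by the keep probability
        # so the expected activation matches what the network saw in training.
        return x * (1.0 - alpha)
```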
## Interpretations
Dropout can be interpreted as training an ensemble of exponentially many networks that share weights: each training step samples one "thinned" sub-network, and the test-time scaling approximately averages the predictions of this ensemble.
It can also be seen as a regularizer, since it discourages neurons from co-adapting.
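A toy sketch (again my own, not from the paper) of why the test-time scaling approximates the ensemble average, here for a single linear unit fed by a dropout layer:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.5
x = rng.normal(size=100)   # activations entering the dropout layer
w = rng.normal(size=100)   # weights of the next (linear) unit

# Monte Carlo estimate of the ensemble: average over many sampled sub-networks.
samples = []
for _ in range(10_000):
    mask = rng.random(100) >= alpha   # keep each unit with probability 1 - alpha
    samples.append(w @ (x * mask))
mc_average = np.mean(samples)

# Weight-scaling approximation used at test time.
scaled = w @ (x * (1.0 - alpha))

print(mc_average, scaled)   # the two values are close
```

For a linear unit the two agree in expectation; with nonlinearities the scaling is only an approximation, which is the point the paper makes about the weight-scaling inference rule.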