Recurrent Neural Network Regularization
Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals
arXiv e-Print archive, 2014
TLDR; The authors show that applying dropout only to the **non-recurrent** connections of an LSTM (between layers at the same timestep, never on the hidden and cell states carried across timesteps) works well, improving performance on a range of sequence tasks.
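A minimal PyTorch sketch of the idea (not the authors' original code; the class name, shapes, and dropout rate below are illustrative assumptions): dropout is applied wherever activations flow *upward* between layers within a timestep, while the recurrent state passed *along* the time axis is left untouched.

```python
import torch
import torch.nn as nn


class NonRecurrentDropoutLSTM(nn.Module):
    """Stacked LSTM with dropout on non-recurrent connections only.

    Hypothetical illustration: dropout hits the input of each layer at
    every timestep (and the top layer's output), but the (h, c) state
    flowing across timesteps is never dropped.
    """

    def __init__(self, input_size, hidden_size, num_layers=2, dropout=0.5):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.LSTMCell(input_size if i == 0 else hidden_size, hidden_size)
            for i in range(num_layers)
        )
        self.drop = nn.Dropout(dropout)

    def forward(self, x, states):
        # x: (seq_len, batch, input_size)
        # states: list of (h, c) tuples, one per layer, each (batch, hidden_size)
        outputs = []
        for t in range(x.size(0)):
            inp = self.drop(x[t])            # dropout on the upward (non-recurrent) path
            for i, cell in enumerate(self.layers):
                h, c = cell(inp, states[i])  # recurrent (h, c) passes through undropped
                states[i] = (h, c)
                inp = self.drop(h)           # dropout between layers, same timestep
            outputs.append(inp)
        return torch.stack(outputs), states


# Usage sketch: zero-initialized states for a batch of 4, hidden size 128.
model = NonRecurrentDropoutLSTM(input_size=64, hidden_size=128)
states = [(torch.zeros(4, 128), torch.zeros(4, 128)) for _ in model.layers]
out, states = model(torch.randn(10, 4, 64), states)
```

Note that PyTorch's built-in `nn.LSTM(..., dropout=p)` already applies dropout to the outputs of each stacked layer except the last, which is essentially this non-recurrent scheme between layers.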
#### Datasets and model performance
- PTB Language Modeling, word-level test perplexity: 78.4
- Google Icelandic Speech Dataset, frame accuracy: 70.5%
- WMT'14 English to French Machine Translation BLEU: 29.03
- MS COCO Image Caption Generation BLEU: 24.3