Massive Exploration of Neural Machine Translation Architectures
Britz, Denny and Goldie, Anna and Luong, Minh-Thang and Le, Quoc V.
arXiv e-Print archive - 2017 via Local Bibsonomy
Keywords: dblp
Investigates different hyperparameter choices for encoder-decoder NMT models through a large-scale empirical study. The authors find that LSTM cells outperform GRU cells, a 2-layer bidirectional encoder is sufficient, additive attention outperforms multiplicative attention, and a well-tuned beam search with length penalty is important. They achieve competitive results on the WMT'15 English->German task and release their code.
https://i.imgur.com/GaAsTvE.png
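As a minimal sketch of the additive (Bahdanau-style) attention the study found to work best, the score is v^T tanh(W_q q + W_k k), normalized with a softmax over source positions. All names, dimensions, and the NumPy formulation below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def additive_attention_weights(query, keys, W_q, W_k, v):
    """Additive attention: score(q, k_t) = v^T tanh(W_q q + W_k k_t).

    query: (d_q,) decoder state; keys: (T, d_k) encoder states.
    W_q, W_k, v are learned parameters (random here for illustration).
    Returns a softmax-normalized weight per source position.
    """
    e = np.tanh(query @ W_q + keys @ W_k) @ v   # (T,) unnormalized scores
    exp = np.exp(e - e.max())                   # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
d_q, d_k, d_a, T = 4, 4, 8, 5                   # hypothetical sizes
weights = additive_attention_weights(
    rng.normal(size=d_q), rng.normal(size=(T, d_k)),
    rng.normal(size=(d_q, d_a)), rng.normal(size=(d_k, d_a)),
    rng.normal(size=d_a))
print(weights)  # one weight per source position, summing to 1
```

Multiplicative (Luong-style) attention would instead score with q^T W k; the paper reports the additive form as slightly stronger in their setup.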