This paper presents a simple approach to predicting
sequences from sequential input. The authors use a multi-layer
LSTM-based encoder-decoder architecture and show
promising results on the task of neural machine translation.
Their approach beats a phrase-based statistical machine
translation (SMT) system by more than 1.0 BLEU point and comes
close to the state of the art when used to re-rank the 1000-best
predictions of the SMT system. Main contributions:
- The first LSTM encodes the input sequence into a single
fixed-size vector, which is then decoded by a second LSTM. End of
sequence is indicated by a special <EOS> token (see the sketch
after this list).
- Deep LSTMs with 4 layers are used for both the encoder and
the decoder.
- 160k source vocabulary, 80k target vocabulary. Trained on
12M sentence pairs. Words in the output sequence are generated by
a softmax over the fixed target vocabulary.
- Beam search is used at test time to predict translations
(a beam size of 2 already provides most of the benefit; see the
beam search sketch after this list).
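A minimal PyTorch sketch of this encoder-decoder setup (the layer
and embedding sizes, variable names, and batch-first layout are my
own illustrative choices, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Encoder-decoder LSTM: encode the source into the final LSTM state,
    then decode the target conditioned on that state."""
    def __init__(self, src_vocab=160_000, tgt_vocab=80_000,
                 emb_dim=256, hidden_dim=512, num_layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder LSTM: reads the (reversed) source sequence.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, num_layers, batch_first=True)
        # Decoder LSTM: initialised with the encoder's final state.
        self.decoder = nn.LSTM(emb_dim, hidden_dim, num_layers, batch_first=True)
        # Softmax (via logits) over the fixed target vocabulary.
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # src_ids: (batch, src_len); tgt_ids: (batch, tgt_len), <EOS>-terminated.
        _, (h, c) = self.encoder(self.src_emb(src_ids))  # final state = "thought vector"
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), (h, c))
        return self.out(dec_out)  # (batch, tgt_len, tgt_vocab) logits
```

And a toy beam search over a generic next-token scorer; the
`step_fn` interface is a hypothetical stand-in for running the
decoder one step, not the paper's code:

```python
def beam_search(step_fn, bos_id, eos_id, beam_size=2, max_len=50):
    """Left-to-right beam search.

    step_fn(prefix) returns a list of log-probabilities over the
    target vocabulary for the next token given the token prefix.
    """
    beams = [([bos_id], 0.0)]        # (token prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            log_probs = step_fn(prefix)
            # Keep only the top `beam_size` extensions of each hypothesis.
            top = sorted(range(len(log_probs)),
                         key=lambda i: log_probs[i], reverse=True)[:beam_size]
            candidates += [(prefix + [t], score + log_probs[t]) for t in top]
        # Prune to the `beam_size` best partial hypotheses overall.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates:
            if prefix[-1] == eos_id:
                finished.append((prefix, score))
            elif len(beams) < beam_size:
                beams.append((prefix, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])
```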
## Strengths
- Qualitative results (PCA projections) show that learned representations are
fairly insensitive to active/passive voice, as sentences similar in meaning
are clustered together.
- Another interesting observation: reversing the source sequence
gives a significant boost on long sentences and improves overall
performance, most likely because it introduces short-term
dependencies between the beginnings of the source and target
sentences that are more easily captured by the gradients
(illustrated in the snippet after this list).
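The reversal argument can be made concrete with a toy calculation:
under a roughly monotone word alignment, reversing the source
leaves the average distance between corresponding source and
target words unchanged, but makes the first few aligned pairs very
close. A purely illustrative sketch (not from the paper):

```python
def aligned_distances(src_len, reverse=False):
    """Time-step distance between source word i and target word i,
    assuming the decoder starts right after the encoder finishes
    and the alignment is monotone (an illustrative simplification)."""
    dists = []
    for i in range(src_len):
        src_pos = (src_len - 1 - i) if reverse else i
        tgt_pos = src_len + i   # decoder steps follow the encoder
        dists.append(tgt_pos - src_pos)
    return dists

print(aligned_distances(5))                # [5, 5, 5, 5, 5]
print(aligned_distances(5, reverse=True))  # [1, 3, 5, 7, 9] -- same mean,
                                           # but much shorter minimal lag
```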
## Weaknesses / Notes
- The source-reversal idea needs better justification; otherwise
it comes across as an 'ugly hack'.
- To re-score the n-best list of predictions from the baseline,
they average the confidences (log-probabilities) of the LSTM and
the baseline model. They should also have reported re-ranking
results using just the LSTM confidences (see the sketch below).
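A sketch of the re-scoring step; the interface and the `alpha`
interpolation weight are my own additions (the paper simply
averages the two log-probabilities, i.e. alpha = 0.5), and
alpha = 1.0 would give the LSTM-only re-ranking asked for above:

```python
def rescore_nbest(nbest, lstm_score, alpha=0.5):
    """Re-rank an n-best list by interpolating SMT and LSTM scores.

    nbest: list of (hypothesis, smt_log_prob) pairs from the baseline.
    lstm_score: function mapping a hypothesis to its LSTM log-prob.
    """
    scored = [(hyp, alpha * lstm_score(hyp) + (1 - alpha) * smt_lp)
              for hyp, smt_lp in nbest]
    return max(scored, key=lambda x: x[1])[0]
```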