First published: 2017/10/31
Abstract: Machine translation has recently achieved impressive performance thanks to
recent advances in deep learning and the availability of large-scale parallel
corpora. There have been numerous attempts to extend these successes to
low-resource language pairs, yet requiring tens of thousands of parallel
sentences. In this work, we take this research direction to the extreme and
investigate whether it is possible to learn to translate even without any
parallel data. We propose a model that takes sentences from monolingual corpora
in two different languages and maps them into the same latent space. By
learning to reconstruct in both languages from this shared feature space, the
model effectively learns to translate without using any labeled data. We
demonstrate our model on two widely used datasets and two language pairs,
reporting BLEU scores up to 32.8, without using even a single parallel sentence
at training time.
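The model is a seq2seq encoder-decoder trained with two objectives: an autoencoder objective that reconstructs sentences in each language, and an adversarial objective in which a discriminator tries to identify the language of the latent representation.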
The system is trained to correct noisy versions of its own output and iteratively improves performance.
The approach does not require parallel corpora, but it relies on a separate method for inducing a bilingual dictionary, which bootstraps the translation.
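
The two ingredients mentioned above, corrupting a sentence so the model has something to correct and bootstrapping with a dictionary, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `add_noise` and `word_by_word`, the parameters `p_drop` and `k`, and the choice of noise (word dropout followed by a bounded local shuffle) are assumptions made for illustration, and the toy dictionary is hypothetical.

```python
import random


def add_noise(tokens, p_drop=0.1, k=3, rng=None):
    """Corrupt a sentence: drop words at random, then shuffle locally.

    Assumed noise model (illustrative, not taken from the summary above):
    each token is dropped with probability p_drop, and the surviving
    tokens are reordered so that none moves more than k positions.
    """
    rng = rng or random.Random()
    # Word dropout: keep each token with probability 1 - p_drop.
    kept = [t for t in tokens if rng.random() >= p_drop]
    if not kept and tokens:
        kept = [rng.choice(tokens)]  # avoid returning an empty sentence
    # Local shuffle: add a random offset in [0, k + 1) to each index and
    # sort by the result; this bounds each token's displacement by k.
    keys = [i + rng.random() * (k + 1) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=keys.__getitem__)
    return [kept[i] for i in order]


def word_by_word(tokens, dictionary):
    """Bootstrap translation: look up each word, copy unknown words."""
    return [dictionary.get(t, t) for t in tokens]


if __name__ == "__main__":
    toy_dict = {"the": "le", "cat": "chat", "sleeps": "dort"}  # hypothetical
    sentence = "the cat sleeps".split()
    draft = word_by_word(sentence, toy_dict)
    print(draft)
    print(add_noise(draft, rng=random.Random(0)))
```

In a full system, the noisy sentence would be fed to the encoder and the decoder trained to recover the clean one; the word-by-word output only serves as the initial, rough translation model.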