Summary by Jon Gauthier
This article makes the argument for *interactive* language learning, motivated by some nice recent small-domain success. I can certainly agree with the motivation: if language is used in conversation, shouldn't we be building models which know how to behave in conversation?
The authors develop a standard multi-agent communication paradigm in which two agents learn to communicate in a single-round reference game. (There are no references to e.g. Kirby or any [ILM][1] work, which is in the same space.) Agent `A1` examines a referent `R` and "transmits" a one-hot utterance representation to `A2`, who must identify `R` given the utterance. A conversation counts as a success when `A2` picks the correct referent `R`, and the two agents are jointly trained to maximize this success metric via REINFORCE (policy gradient).
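To make the setup concrete, here is a minimal sketch of how such a one-round reference game trained with REINFORCE might look. The architectures (a single linear speaker, an embedding-based listener), the toy sizes, and the lack of a reward baseline are my own simplifications for illustration, not the paper's actual model.

```python
# Minimal sketch (my own, not the authors' code) of the one-round reference game:
# A1 emits a discrete symbol for a referent, A2 points at a referent among
# distractors, and both agents are trained jointly with REINFORCE.
import torch
import torch.nn as nn

N_ATTRS, VOCAB, N_CANDIDATES = 8, 16, 5   # assumed toy sizes

speaker = nn.Linear(N_ATTRS, VOCAB)            # A1: referent features -> utterance logits
listener_embed = nn.Embedding(VOCAB, N_ATTRS)  # A2: utterance -> query over candidates
opt = torch.optim.Adam(
    list(speaker.parameters()) + list(listener_embed.parameters()), lr=1e-2)

for step in range(1000):
    candidates = torch.randn(N_CANDIDATES, N_ATTRS)    # random candidate referents
    target = torch.randint(N_CANDIDATES, (1,)).item()  # index of the referent A1 sees

    # A1 samples a one-hot utterance from a categorical policy over the vocabulary.
    utt_probs = torch.softmax(speaker(candidates[target]), dim=-1)
    utt = torch.multinomial(utt_probs, 1)

    # A2 scores each candidate against the utterance embedding and samples a guess.
    scores = candidates @ listener_embed(utt).squeeze(0)
    guess_probs = torch.softmax(scores, dim=-1)
    guess = torch.multinomial(guess_probs, 1)

    # Shared reward: 1 if A2 picked the correct referent, else 0.
    reward = float(guess.item() == target)

    # REINFORCE: maximize E[reward] via the score-function gradient
    # (no baseline / variance reduction here, for brevity).
    log_p = torch.log(utt_probs[utt]) + torch.log(guess_probs[guess])
    loss = -reward * log_p.sum()

    opt.zero_grad()
    loss.backward()
    opt.step()
```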
**This is mathematically equivalent to [the NVIL model (Mnih and Gregor, 2014)][2]**, an autoencoder with "hard" latent codes which is likewise trained by policy gradient methods.
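Concretely, the equivalence as I read it (this gloss is mine, not a formula from either paper) is that both setups optimize an expectation over a discrete latent choice with the same score-function (REINFORCE) gradient estimator; only the "reward" term differs:

```latex
% Score-function (REINFORCE) gradient shared by both setups:
\nabla_\theta \, \mathbb{E}_{z \sim q_\theta(z \mid x)}\big[ f(x, z) \big]
  = \mathbb{E}_{z \sim q_\theta(z \mid x)}\big[ f(x, z) \, \nabla_\theta \log q_\theta(z \mid x) \big]

% NVIL (autoencoder with a hard latent code z):
%   f(x, z) = \log p_\phi(x \mid z)              (reconstruction term)
% Reference game (the utterance plays the role of z):
%   f(x, z) = \mathbf{1}[\, A_2 \text{ picks } R \,]   (communicative success)
```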
They perform a nice, thorough evaluation of both successful and "cheating" models. This will serve as a useful reference and starting point for people interested in interactive language acquisition.
The way forward is clear, I think: let's develop agents in more complex environments, interacting in multi-round conversations, with more complex / longer utterances.
[1]: http://cocosci.berkeley.edu/tom/papers/IteratedLearningEvolutionLanguage.pdf
[2]: https://arxiv.org/abs/1402.0030
