One-shot Learning with Memory-Augmented Neural Networks
Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap
arXiv e-Print archive, 2016
Keywords: cs.LG
First published: 2016/05/19

Abstract: Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
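
The content-focused memory access mentioned in the last sentence amounts to addressing memory rows by similarity to a key rather than by position. Below is a minimal sketch of such a read step, assuming a memory matrix `M` of shape (N slots, W features) and a key vector `k` of width W; the function name, the softmax normalization, and the toy values are illustrative assumptions, not the paper's exact LRUA module.

```python
import numpy as np

def content_read(M: np.ndarray, k: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Content-based read: weight memory rows by cosine similarity to key k.

    Hypothetical sketch of content addressing, not the paper's implementation.
    """
    # Cosine similarity between the key and every memory row.
    sims = M @ k / (np.linalg.norm(M, axis=1) * np.linalg.norm(k) + eps)
    # Softmax over similarities gives read weights that sum to 1.
    w = np.exp(sims - sims.max())
    w /= w.sum()
    # The read vector is the weighted average of memory contents.
    return w @ M

# Toy example: 4 memory slots of width 3; the key is closest to row 2.
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.5, 0.5, 0.0]])
k = np.array([0.0, 0.1, 0.9])
print(content_read(M, k))  # dominated by row 2's content
```

Because the weights depend only on what is stored, not where, a just-written item can be retrieved on the next query, which is the property the abstract credits for rapid assimilation of new data.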