Using Fast Weights to Attend to the Recent Past
Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, and Catalin Ionescu
arXiv e-Print archive, 2016
Keywords:
stat.ML, cs.LG, cs.NE
First published: 2016/10/20
Abstract: Until recently, research on artificial neural networks was largely restricted
to systems with only two types of variable: Neural activities that represent
the current or recent input and weights that learn to capture regularities
among inputs, outputs and payoffs. There is no good reason for this
restriction. Synapses have dynamics at many different time-scales and this
suggests that artificial neural networks might benefit from variables that
change slower than activities but much faster than the standard weights. These
"fast weights" can be used to store temporary memories of the recent past and
they provide a neurally plausible way of implementing the type of attention to
the past that has recently proved very helpful in sequence-to-sequence models.
By using fast weights we can avoid the need to store copies of neural activity
patterns.
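
The mechanism the abstract alludes to can be made concrete. In the paper, a fast weight matrix A is decayed and updated with the outer product of the current hidden state, A(t) = lambda * A(t-1) + eta * h(t) h(t)^T, and a brief inner loop uses A to settle the next hidden state. Below is a minimal NumPy sketch of those two steps; the decay rate, fast learning rate, vector dimension, and the omission of layer normalization are illustrative assumptions, not the paper's settings.

    import numpy as np

    def update_fast_weights(A, h, lam=0.95, eta=0.5):
        # Fast weight update: A(t) = lam * A(t-1) + eta * h(t) h(t)^T.
        # lam (decay) and eta (fast learning rate) are illustrative values.
        return lam * A + eta * np.outer(h, h)

    def inner_loop(A, sustained_input, n_steps=1):
        # Settle the next hidden state by repeatedly "attending" to the
        # recent past stored in A. sustained_input stands in for the slow
        # recurrent and input contributions (W_h h(t) + W_x x(t)); the
        # paper also applies layer normalization, omitted here for brevity.
        h = np.tanh(sustained_input)  # preliminary hidden state
        for _ in range(n_steps):
            h = np.tanh(sustained_input + A @ h)
        return h

    # Toy usage: write a few hidden states into A, then retrieve.
    dim = 8
    rng = np.random.default_rng(0)
    A = np.zeros((dim, dim))
    for _ in range(5):
        A = update_fast_weights(A, rng.standard_normal(dim))
    h_next = inner_loop(A, rng.standard_normal(dim), n_steps=3)

Because A is a sum of decaying outer products of past hidden states, multiplying by A retrieves those states weighted by their scalar product with the current state. This is the sense in which fast weights implement attention to the recent past without storing explicit copies of neural activity patterns.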