The paper introduces an approach to model external memory in a differentiable way. Memory is modeled as an $N \times W$ matrix: the memory has $N$ independent storage locations, each able to store a datum of length $W$.
### DNC Architecture
A controller network is trained to utilize the memory. Memory access is modeled in cycles: at each time-step $t$ the network emits read and write commands as part of its output. The commands are then processed, and the read data is given to the network at time-step $t+1$ as part of its input. Any deep learning architecture can be used as the controller network (e.g. a standard feed-forward CNN); the paper uses a deep LSTM architecture as the controller.
![DNC architecture (source: DeepMind)](https://storage.googleapis.com/deepmind-live-cms/images/dnc_figure1.width-1500.png)
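To make the cycle concrete, here is a minimal sketch of the read/write loop in Python. All names, the dummy interface vector, and the additive write are assumptions for illustration, not the paper's actual parameterisation:

```python
import numpy as np

N, W, R = 16, 8, 2   # memory locations, word width, number of read heads
rng = np.random.default_rng(0)

def controller_step(x, prev_reads, state):
    """Stand-in for any controller network (the paper uses a deep LSTM):
    maps the current input plus the previous step's read vectors to an
    output and an interface vector carrying the read/write commands."""
    features = np.concatenate([x] + prev_reads)
    interface = np.tanh(rng.standard_normal(N + W))  # dummy commands
    return features.mean(), interface, state

memory = np.zeros((N, W))
reads = [np.zeros(W) for _ in range(R)]   # read data, fed back at t+1
state = None

for t in range(5):                         # one memory cycle per time-step
    x = rng.standard_normal(4)             # toy input
    out, iface, state = controller_step(x, reads, state)
    w_write = np.exp(iface[:N]) / np.exp(iface[:N]).sum()  # soft write weights
    memory += np.outer(w_write, iface[N:])                 # additive soft write
    reads = [memory.T @ w_write for _ in range(R)]         # soft reads for t+1
```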
### Memory Interaction
The control commands allow the network to interact with the data in three different ways:
1. Content lookup: access is weighted by how closely the content of each location matches a key emitted by the controller (see the sketch after this list).
2. Sequential read access: for each read vector $v$ the network received as input at time $t$, it can access the data that was written directly after $v$.
3. Usage-based write access: a "usage" vector $u \in [0,1]^N$ models the importance of each location. The network can choose to write to the location with the lowest usage level, and $u$ can be decreased ("freeing memory") at each time-step.
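As an illustration of the content-lookup mechanism, the sketch below computes a soft weighting from cosine similarities between a key and the memory rows, sharpened by a strength parameter $\beta$. The function name and the toy values are assumptions; the paper's full addressing additionally combines this with the sequential and usage-based mechanisms:

```python
import numpy as np

def content_weights(memory, key, beta):
    """Content lookup: cosine similarity between the key and each memory
    row, sharpened by strength beta and normalised into a soft weighting
    over locations (a minimal sketch of content-based addressing)."""
    eps = 1e-8
    sims = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    e = np.exp(beta * sims)
    return e / e.sum()

memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
print(content_weights(memory, key=np.array([1.0, 0.2]), beta=5.0))
# Puts most of the weight on the first row, whose content best matches the key.
```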
#### Differentiable Operations
All memory operations are modeled in a differentiable way, so that the entire model can be trained end-to-end using gradient descent. To do this, all read commands are mapped to a soft read weighting $w^r \in [0,1]^N$ such that $\sum^N_{i=1} w^r_i = 1$. The read vector $r$ is then defined as a weighted sum over the rows of the memory: $r = \sum^N_{i=1} w^r_i \, M[i, \cdot]$.
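The read operation is just a matrix-vector product, which is what makes it differentiable with respect to both the weighting and the memory contents. A minimal numeric sketch (toy values, not from the paper):

```python
import numpy as np

N, W = 4, 3
M = np.arange(N * W, dtype=float).reshape(N, W)   # memory matrix
logits = np.array([2.0, -1.0, 0.5, 0.0])          # raw read command

w_read = np.exp(logits) / np.exp(logits).sum()    # w^r: entries in [0,1], sum to 1
r = M.T @ w_read                                  # r = sum_i w^r_i * M[i, :]
print(r)                                          # a smooth blend of memory rows
```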
### Experiments
The network was trained to perform a variety of memory-intensive tasks. The results show that the network learns to take advantage of the external memory.
https://www.youtube.com/watch?v=B9U8sI7TcMY