Summary by kdubovikov
The paper proposes a reusable neural network module to `reason about the relations between entities and their properties`:
$$ RN(O) = f_\phi \left( \sum_{i,j} g_\theta(o_i, o_j) \right), $$
- $O$ is the set of input objects $\{o_1, o_2, \ldots, o_n\}$, with $o_i \in \mathbb{R}^m$
- $g_\theta$ is a neural network (MLP) that approximates the object-to-object relation function
- $f_\phi$ is a neural network (MLP) that transforms the summed pairwise relations into the desired output (see the sketch below)
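Concretely, the formula reduces to two MLPs and a sum over all object pairs. Below is a minimal sketch of that computation, assuming PyTorch; the class name, layer sizes, and MLP depths are illustrative and not taken from the paper:

```python
import torch
import torch.nn as nn

class RelationNetwork(nn.Module):
    """Minimal sketch of RN(O) = f_phi(sum_{i,j} g_theta(o_i, o_j))."""

    def __init__(self, object_dim, hidden_dim, output_dim):
        super().__init__()
        # g_theta: MLP applied to each concatenated object pair (o_i, o_j)
        self.g = nn.Sequential(
            nn.Linear(2 * object_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # f_phi: MLP applied to the sum of all pairwise relation vectors
        self.f = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, output_dim),
        )

    def forward(self, objects):
        # objects: (batch, n, object_dim)
        b, n, d = objects.shape
        # Build all ordered pairs (o_i, o_j): (batch, n*n, 2*object_dim)
        o_i = objects.unsqueeze(2).expand(b, n, n, d)
        o_j = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([o_i, o_j], dim=-1).reshape(b, n * n, 2 * d)
        relations = self.g(pairs)        # (batch, n*n, hidden_dim)
        pooled = relations.sum(dim=1)    # summing over pairs gives order invariance
        return self.f(pooled)
```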
RNs operate on sets (due to the summation in the formula) and are thus invariant to the order of the input objects.
Architecturally, the RN module sits at the tail of a network and takes its input objects in the form of CNN or LSTM embeddings, as sketched below.
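For image tasks, for example, each spatial cell of the final CNN feature map can be treated as one object, optionally tagged with its coordinates. A hedged sketch of that conversion, reusing the `RelationNetwork` module above (the function name and shapes are illustrative; the paper additionally conditions $g_\theta$ on the question embedding, which is omitted here for brevity):

```python
import torch

def feature_map_to_objects(feature_map):
    """Treat each spatial cell of a CNN feature map as an 'object'.

    feature_map: (batch, channels, H, W) -> objects: (batch, H*W, channels + 2),
    where the extra 2 dims are normalized (x, y) coordinates so pairs carry position.
    """
    b, c, h, w = feature_map.shape
    objects = feature_map.flatten(2).transpose(1, 2)   # (batch, H*W, channels)
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    coords = torch.stack([xs, ys], dim=-1).reshape(1, h * w, 2).expand(b, -1, -1)
    return torch.cat([objects, coords], dim=-1)

# Usage with the RelationNetwork sketch above (hypothetical shapes):
fmap = torch.randn(4, 64, 8, 8)          # output of some CNN backbone
objs = feature_map_to_objects(fmap)      # (4, 64, 66)
rn = RelationNetwork(object_dim=66, hidden_dim=128, output_dim=10)
answer_logits = rn(objs)                 # (4, 10)
```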
The approach is evaluated on several tasks, where it achieves strong (on CLEVR, even superhuman) performance:
- CLEVR and Sort-of-CLEVR - question answering about images
- bAbI - text-based question answering
- Dynamic physical systems - MuJoCo simulations with physical relations between entities