Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn, Ian J. Goodfellow, and Sergey Levine
arXiv e-Print archive - 2016 via Local Bibsonomy
Problem
----------
Given the first frames of a video of robot motion, together with the robot's actions, predict the future frames of the motion.
Dataset
-----------
1. The authors assembled a new dataset of 59,000 robot interactions involving pushing motions.
2. Human3.6M: video, depth, and mocap data. Actions include sitting, purchasing, waiting...
Approach
------------
* Use LSTMs to "remember" previous frames.
* Predict 10 transformations of the previous frame (each model represents the transformations differently).
* Predict a mask to determine which transformation is applied to which pixel.
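
The masking step above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes each transformation has already produced a candidate next frame, and that the masks are normalized with a softmax across the transformations so that they sum to 1 at every pixel. The function name `composite` and the array shapes are assumptions made for this example.

```python
import numpy as np

def composite(transformed, mask_logits):
    """Blend K candidate frames using per-pixel masks.

    transformed: (K, H, W, C) candidate next frames, one per transformation.
    mask_logits: (K, H, W) unnormalized mask scores (assumed network output).
    Returns the blended frame with shape (H, W, C).
    """
    # Softmax over the K axis so the masks sum to 1 at each pixel.
    shifted = mask_logits - mask_logits.max(axis=0, keepdims=True)
    masks = np.exp(shifted)
    masks /= masks.sum(axis=0, keepdims=True)
    # Weighted sum of the candidates, broadcasting each mask over channels.
    return (masks[..., None] * transformed).sum(axis=0)
```

With equal scores every candidate contributes equally to a pixel. When one score dominates, that pixel is taken almost entirely from the corresponding transformed frame.
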
The authors propose three models based on this approach:
1. Dynamic Neural Advection
2. Convolutional Dynamic Neural Advection
3. Spatial Transformer Predictors
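
To give a feel for what a single predicted transformation might look like, here is a hedged sketch in the spirit of the convolutional variant: one normalized kernel applied across the whole previous frame. In the actual models the kernel would be predicted by the network. Here it is written by hand, and the helper name `apply_kernel`, the single-channel frame, and the edge padding are all assumptions made for this example.

```python
import numpy as np

def apply_kernel(frame, kernel):
    """Apply one k x k kernel to every pixel of a single-channel frame.

    frame:  (H, W) previous frame.
    kernel: (k, k) non-negative weights summing to 1 (k odd).
    Returns the transformed frame with shape (H, W).
    """
    k = kernel.shape[0]
    pad = k // 2
    h, w = frame.shape
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(frame, pad, mode="edge")
    out = np.zeros_like(frame, dtype=float)
    # Each output pixel is a weighted average of its neighbourhood.
    for dy in range(k):
        for dx in range(k):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# A kernel with all its weight one step to the right of centre
# moves the image content one pixel to the left.
shift_left = np.zeros((3, 3))
shift_left[1, 2] = 1.0
```

A kernel that spreads its weight over several offsets expresses uncertainty about where each pixel moves, which is why the weights are treated as a normalized distribution in this sketch.
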