#### Problem addressed:
It has been empirically observed that MCMC chains run in deep representation spaces mix between modes better than chains run in the raw input space. The authors present a set of hypotheses as to why this happens and confirm them empirically.
#### Summary:
The paper claims that deep representations (especially from parametric models) disentangle the factors of variation present in the raw feature space. This disentangling leads to better "mode mixing" during MCMC sampling. For example, in face images the factors of variation could be identity, pose, and illumination. If a higher layer learns these factors, then moving in that representation space starting from a "valid" point changes each factor directly and hence keeps producing "valid" images, which in the original feature space may be far apart; this yields better mode mixing. The main hypothesis is supported by two additional ones: (a) the manifold structure of the "valid" data is flattened in the higher-layer space, and (b) the fraction of the total volume occupied by high-probability (valid) points is larger in the higher-layer space. Hypothesis (a) should lead to better interpolation in the higher-layer space, and (b) should lead to more valid points inside a Parzen window around any known sample. Both are confirmed experimentally.
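The interpolation test behind hypothesis (a) is easy to sketch. The snippet below is a minimal illustration, not the authors' code: `encode`/`decode` are hypothetical stand-ins for a trained deep autoencoder (random linear maps are used only so the sketch runs end-to-end). It interpolates two samples both in raw pixel space and in the top-layer representation space (decoding the latter back to pixels), so the two paths can be compared, e.g. visually or by density under a Parzen estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder/decoder of a trained deep model (e.g. a deep CAE);
# random tied linear maps stand in here so the sketch is runnable.
W = rng.standard_normal((256, 784)) * 0.01

def encode(x):            # pixel space -> top-layer representation
    return np.tanh(W @ x)

def decode(h):            # representation -> pixel space
    return W.T @ h

def interpolate(a, b, n_steps=10):
    """Linear interpolation between two vectors."""
    alphas = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1 - alphas) * a + alphas * b

# Two "valid" samples (stand-ins for two dataset images, e.g. MNIST digits).
x1, x2 = rng.random(784), rng.random(784)

# Path 1: interpolate directly in raw pixel space.
pixel_path = interpolate(x1, x2)

# Path 2: interpolate in representation space, then decode each point.
rep_path = np.array([decode(h) for h in interpolate(encode(x1), encode(x2))])

# Hypothesis (a) predicts the decoded rep_path stays closer to the data
# manifold (more "valid"-looking intermediate images) than pixel_path.
print(pixel_path.shape, rep_path.shape)
```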
#### Novelty:
Novel intuitions as to why deep representations are well suited for generative modeling.
#### Drawbacks:
No theoretical justification is provided; the hypotheses are supported only empirically.
#### Datasets:
MNIST, Toronto Face dataset (TFD)
#### Additional remarks:
Used a DBN and a deep CAE (contractive autoencoder) for the experiments on both datasets; a sketch of the sampling procedure follows below.
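For the autoencoder experiments, the idea of sampling in representation space can be sketched as below. This is only an illustrative sketch, not the paper's exact procedure: `encode`/`decode` are again hypothetical stand-ins for a trained deep CAE, and the chain simply perturbs the top-layer code with Gaussian noise, decodes, and re-encodes.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical encoder/decoder of a trained deep CAE; random linear
# maps are used here only so the sketch executes end-to-end.
W = rng.standard_normal((256, 784)) * 0.01
encode = lambda x: np.tanh(W @ x)
decode = lambda h: W.T @ h

def representation_space_chain(x0, n_steps=100, noise_std=0.1):
    """Simple Markov chain in representation space: perturb the top-layer
    code with Gaussian noise, decode to pixel space, re-encode, repeat."""
    samples = []
    h = encode(x0)
    for _ in range(n_steps):
        h = h + noise_std * rng.standard_normal(h.shape)
        x = decode(h)          # project back to pixel space
        samples.append(x)
        h = encode(x)          # re-encode to stay near the learned manifold
    return np.array(samples)

seed = rng.random(784)         # stand-in for an MNIST/TFD image
chain = representation_space_chain(seed)
print(chain.shape)             # (100, 784): decoded samples along the chain
```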
#### Presenter:
Devansh Arpit