Generating Images with Perceptual Similarity Metrics based on Deep Networks
Alexey Dosovitskiy and Thomas Brox
arXiv e-Print archive, 2016
Keywords:
cs.LG, cs.CV, cs.NE
First published: 2016/02/08
Abstract: Image-generating machine learning models are typically trained with loss
functions based on distance in the image space. This often leads to
over-smoothed results. We propose a class of loss functions, which we call deep
perceptual similarity metrics (DeePSiM), that mitigate this problem. Instead of
computing distances in the image space, we compute distances between image
features extracted by deep neural networks. This metric better reflects
perceptual similarity of images and thus leads to better results. We show
three applications: autoencoder training, a modification of a variational
autoencoder, and inversion of deep convolutional networks. In all cases, the
generated images look sharp and resemble natural images.
This paper proposes a class of loss functions for image generation based on distances in feature spaces:
$$\mathcal{L} = \lambda_{feat}\mathcal{L}_{feat} + \lambda_{adv}\mathcal{L}_{adv} + \lambda_{img}\mathcal{L}_{img}$$
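Here $\mathcal{L}_{feat}$ is a squared distance in the feature space of a fixed comparator network, $\mathcal{L}_{adv}$ is a standard GAN generator loss, and $\mathcal{L}_{img}$ is a squared distance in image space. Below is a minimal PyTorch sketch of the combined objective; the `comparator` and `discriminator` networks and the lambda defaults are placeholders of mine, not the paper's exact setup (the paper uses e.g. AlexNet features as the comparator and tunes the weights per application):

```python
import torch
import torch.nn.functional as F

def deepsim_loss(x_gen, x_target, comparator, discriminator,
                 lam_feat=1.0, lam_adv=1.0, lam_img=1.0):
    # L_img: squared distance in image space (keeps low-frequency content)
    loss_img = F.mse_loss(x_gen, x_target)

    # L_feat: squared distance in the comparator's (fixed, pretrained) feature space
    with torch.no_grad():
        target_feat = comparator(x_target)
    loss_feat = F.mse_loss(comparator(x_gen), target_feat)

    # L_adv (generator side): push discriminator logits toward "real";
    # the discriminator is assumed to output raw logits here
    d_logits = discriminator(x_gen)
    loss_adv = F.binary_cross_entropy_with_logits(
        d_logits, torch.ones_like(d_logits))

    return lam_feat * loss_feat + lam_adv * loss_adv + lam_img * loss_img
```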
### Key Points
- Using only an L2 loss in image space (L_img) yields over-smoothed results, since minimizing it averages over all likely locations of fine details.
- L_feat measures distance in a suitable feature space and therefore preserves the distribution of fine details rather than their exact locations.
- Using only L_feat yields poor results because feature representations are contractive: many non-natural images are also mapped to the same feature vector.
- Introducing a natural image prior via a GAN (the adversarial loss L_adv) ensures that generated samples lie on the natural image manifold; see the sketch after this list.
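As a rough illustration of the GAN prior, here is a sketch of the discriminator update that would be alternated with the generator update; the function names and the logit-output assumption are mine, not the paper's:

```python
import torch
import torch.nn.functional as F

def discriminator_step(x_real, x_gen, discriminator, d_optimizer):
    d_optimizer.zero_grad()
    logits_real = discriminator(x_real)
    logits_fake = discriminator(x_gen.detach())  # don't backprop into the generator
    # Standard GAN objective: real images -> label 1, generated images -> label 0
    loss = (F.binary_cross_entropy_with_logits(logits_real,
                                               torch.ones_like(logits_real))
            + F.binary_cross_entropy_with_logits(logits_fake,
                                                 torch.zeros_like(logits_fake)))
    loss.backward()
    d_optimizer.step()
    return loss.item()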
### Model
![Model overview](https://i.imgur.com/qNzMwQ6.png)
### Experiments
- Autoencoder training
- Image generation with a modified variational autoencoder (VAE)
- Inversion of deep network features (sketched below)
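As a concrete but heavily simplified illustration of the feature-inversion setup, the sketch below trains a small decoder to reconstruct images from fixed AlexNet features. The tiny decoder, the dummy data, the crude upsampling, and the omission of the adversarial term are all my simplifications; the paper uses a much larger up-convolutional generator and the full DeePSiM loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

# Fixed feature extractor to invert (AlexNet conv stack, kept frozen)
phi = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).features.eval()
for p in phi.parameters():
    p.requires_grad_(False)

# Placeholder decoder, far smaller than the paper's up-convolutional generator
G = nn.Sequential(
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),
)
opt = torch.optim.Adam(G.parameters(), lr=2e-4)

for step in range(100):
    x = torch.randn(8, 3, 224, 224)   # dummy batch; real images in practice
    feats = phi(x)                    # 8 x 256 x 6 x 6 for 224x224 inputs
    # Crude upsample to the input resolution; a real generator would
    # produce full-resolution output directly
    x_rec = F.interpolate(G(feats), size=x.shape[-2:])
    # L_feat in the same feature space, plus a small image-space term
    loss = F.mse_loss(phi(x_rec), feats) + 1e-3 * F.mse_loss(x_rec, x)
    opt.zero_grad()
    loss.backward()
    opt.step()
```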
### Thoughts
I find the experiments section somewhat hard to follow. However, the proposed loss seems very promising and could be applied to many image-generation tasks.
### Questions
- Sections 4.2 and 4.3 are hard for me to follow; I need to revisit them more carefully.