Summary by Léo Paillier 6 years ago
_Objective:_ Transfer visual attributes (color, tone, texture, style, etc.) between two semantically meaningful images, such as a picture and a sketch.
## Inner workings:
### Image analogy
An image analogy A:A′::B:B′ is a relation where:
* B′ relates to B in the same way as A′ relates to A
* A and A′ are in pixel-wise correspondence
* B and B′ are in pixel-wise correspondence
In this paper only a source image A and an example image B′ are given, and both A′ and B represent latent images to be estimated.
[![screen shot 2017-05-18 at 10 43 48 am](https://cloud.githubusercontent.com/assets/17261080/26193907/f080e212-3bb6-11e7-9441-7b255e4219f5.png)](https://cloud.githubusercontent.com/assets/17261080/26193907/f080e212-3bb6-11e7-9441-7b255e4219f5.png)
### Dense correspondence
To find dense correspondences between the two images, they extract features from a pre-trained CNN (VGG-19), taking the feature maps at its ReLU layers.
The mapping is divided into two sub-mappings that are easier to compute: first a visual attribute transformation, then a spatial transformation.
[![screen shot 2017-05-18 at 11 04 58 am](https://cloud.githubusercontent.com/assets/17261080/26194835/03ccd94a-3bba-11e7-93ca-9420d4d96162.png)](https://cloud.githubusercontent.com/assets/17261080/26194835/03ccd94a-3bba-11e7-93ca-9420d4d96162.png)
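To make the idea of a dense correspondence concrete, here is a minimal NumPy sketch that matches every position of one feature map to its most similar position in another. It is an illustration under assumptions, not the paper's method: it uses a brute-force search on single feature vectors with cosine similarity, and the function name `nearest_neighbor_field` and the `(H, W, C)` array layout are hypothetical choices made for this example.

```python
import numpy as np


def nearest_neighbor_field(feat_a, feat_b):
    """Return an (H, W, 2) integer array of (row, col) offsets.

    For each position in feat_a, the offset points to the most similar
    position in feat_b (brute-force search, cosine similarity).
    Both inputs are assumed to be (H, W, C) feature maps.
    """
    h, w, c = feat_a.shape
    _, wb, _ = feat_b.shape

    # Flatten the spatial dimensions and L2-normalize each feature vector.
    a = feat_a.reshape(-1, c)
    b = feat_b.reshape(-1, c)
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)

    # Index of the best match in B for every position in A.
    best = np.argmax(a @ b.T, axis=1)
    rows, cols = np.divmod(best, wb)

    # Convert absolute matched coordinates into offsets.
    grid_r, grid_c = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return np.stack(
        [rows.reshape(h, w) - grid_r, cols.reshape(h, w) - grid_c], axis=-1
    )
```

The brute-force search costs O((HW)²) similarity evaluations, so it is only practical on small feature maps such as the coarsest layer.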
## Architecture:
The algorithm proceeds as follows:
1. Compute features at each layer for the input images using a pre-trained CNN, and initialize the feature maps of the latent images with those of the coarsest layer.
2. For that layer, compute a forward and a reverse nearest-neighbor field (NNF, essentially an offset field).
3. Use this NNF together with the input features of the current layer to compute the features of the latent images.
4. Upsample the NNF and use it as the initialization for the NNF of the next layer.
[![screen shot 2017-05-18 at 11 14 33 am](https://cloud.githubusercontent.com/assets/17261080/26195178/35277e0e-3bbb-11e7-82ce-037466314640.png)](https://cloud.githubusercontent.com/assets/17261080/26195178/35277e0e-3bbb-11e7-82ce-037466314640.png)
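The two NNF operations in the steps above, applying an offset field to a feature map and upsampling it for the next layer, can be sketched as follows. This is a simplified illustration under assumptions: the helper names `warp_with_nnf` and `upsample_nnf` are hypothetical, offsets are assumed to be integer `(row, col)` pairs, and out-of-range lookups are simply clamped to the border. It leaves out everything else the method does at each layer.

```python
import numpy as np


def warp_with_nnf(features, offsets):
    """Pull features through an offset field.

    out[r, c] = features[r + dr, c + dc], with coordinates clamped
    to the valid range. `offsets` is an integer (H, W, 2) array.
    """
    h, w = offsets.shape[:2]
    fh, fw = features.shape[:2]
    grid_r, grid_c = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_r = np.clip(grid_r + offsets[..., 0], 0, fh - 1)
    src_c = np.clip(grid_c + offsets[..., 1], 0, fw - 1)
    return features[src_r, src_c]


def upsample_nnf(offsets, factor=2):
    """Nearest-neighbor upsampling of an offset field.

    Each offset is repeated over a factor x factor block and scaled by
    `factor`, since displacements grow with the resolution.
    """
    up = np.repeat(np.repeat(offsets, factor, axis=0), factor, axis=1)
    return up * factor
```

The upsampled field is only a starting guess for the finer layer; it would still need to be refined by a new nearest-neighbor search there.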
## Results:
Impressive quality on all types of visual transfer, but very slow (~3 min on GPUs for one image).
[![screen shot 2017-05-18 at 11 36 47 am](https://cloud.githubusercontent.com/assets/17261080/26196151/54ef423c-3bbe-11e7-9433-b29be5091fae.png)](https://cloud.githubusercontent.com/assets/17261080/26196151/54ef423c-3bbe-11e7-9433-b29be5091fae.png)