Summary by robromijnders 6 years ago
# Semantic Segmentation using Adversarial networks
## Luc, Couprie, Chintala, Verbeek, 2016
* The paper aims to improve segmentation performance (IoU) by extending the segmentation network with an adversarial discriminator during training.
* The authors derive intuition from GANs, where a game is played between a generator and a discriminator.
* In this work, the game works as follows: a segmentation network maps an image of size HxWx3 to a label map of size HxWxC. A discriminator CNN is tasked with discriminating the generated label maps from the ground truth. It is an adversarial game because the segmentor aims for _more real_ label maps while the discriminator aims to distinguish them from the ground truth.
* The discriminator is a CNN that maps from HxWxC to a binary label.
* Section 3.2 outlines three ways to feed the label maps to the discriminator:
  * __Basic__, where the label maps are concatenated to the image and fed to the discriminator. The authors observe that leaving the image out does not change performance, so they end up feeding only the label maps for _basic_.
  * __Product__, where the label maps and the input image are multiplied, leading to an input of 3C channels.
  * __Scaling__, which resembles basic, but the one-hot ground-truth distribution is perturbed slightly. This prevents the discriminator from trivially detecting the (zero) entropy of one-hot maps rather than learning anything useful.
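
The three input encodings can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function names, the `tau` parameter, and the exact form of the scaling perturbation (put at least mass `tau` on the true class and rescale the predicted probabilities of the other classes) are assumptions; the summary only says that the one-hot map is perturbed a bit.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn an (H, W) integer label image into an (H, W, C) one-hot label map."""
    return np.eye(num_classes, dtype=np.float32)[labels]

def basic_input(label_map):
    """Basic: the label map alone (the image was found not to help)."""
    return label_map

def product_input(label_map, image):
    """Product: multiply every class channel with every RGB channel -> (H, W, 3C)."""
    h, w, c = label_map.shape
    prod = label_map[:, :, :, None] * image[:, :, None, :]  # (H, W, C, 3)
    return prod.reshape(h, w, 3 * c)

def scaling_input(one_hot_map, predicted_map, tau=0.9, eps=1e-8):
    """Scaling (assumed form): soften the one-hot ground truth.

    The true class gets mass max(tau, predicted probability); the remaining
    mass is spread over the other classes in proportion to the prediction.
    """
    p_true = (one_hot_map * predicted_map).sum(axis=-1, keepdims=True)
    y_true = np.maximum(tau, p_true)
    scale = (1.0 - y_true) / np.maximum(1.0 - p_true, eps)
    return one_hot_map * y_true + (1.0 - one_hot_map) * predicted_map * scale

# Tiny usage example on a 2x2 image with C = 3 classes.
labels = np.array([[0, 1], [2, 1]])
image = np.arange(12, dtype=np.float32).reshape(2, 2, 3)
gt = one_hot(labels, 3)
print(product_input(gt, image).shape)  # (2, 2, 9)
```

With the product encoding, channel `3 * c + k` holds RGB channel `k` masked by class `c`, so the discriminator sees which pixels of the image were assigned to each class.
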
* The discriminator is constructed with two axes of variation, leading to 4 architectures:
  * __FOV__: a field of view of either 18x18 or 34x34 over the label map
  * __light__: an architecture with more or less capacity, e.g. number of channels
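
To get a feel for where field-of-view numbers like these come from, the receptive field of a stack of convolution and pooling layers can be computed with the standard recurrence. The layer stacks below are hypothetical, chosen only because they happen to give 18 and 34; the summary does not list the actual layers.

```python
def receptive_field(layers):
    """Receptive field of a stack of (kernel_size, stride) layers.

    Standard recurrence: each layer widens the field by (kernel - 1) times
    the product of all earlier strides.
    """
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field

CONV3 = (3, 1)  # 3x3 convolution, stride 1
POOL2 = (2, 2)  # 2x2 max-pooling, stride 2

# Hypothetical stacks, not taken from the paper.
small_fov = [CONV3, CONV3, CONV3, POOL2, CONV3, CONV3, POOL2]
large_fov = small_fov + [CONV3, CONV3]

print(receptive_field(small_fov))  # 18
print(receptive_field(large_fov))  # 34
```
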
* The paper shows reasonable results on the Stanford Background dataset, but keep in mind that it contains only about 700 images.
* The improvement on the Pascal VOC dataset is minor: the IoU goes from 71.8 to 72.0.
* The authors tried to pretrain the adversary, but they found this led to unstable training. They end up training in an alternating scheme between segmentor and discriminator. They found that slow alternations work best.
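
The adversarial game and the alternating scheme can be sketched as follows. This is a toy illustration under stated assumptions: the weight `lam`, the `period` parameter, and the choice to start with the segmentor are hypothetical, and the losses operate on single scalar discriminator outputs rather than on real networks.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one discriminator output in (0, 1)."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants ground truth -> 1 and generated label maps -> 0."""
    return bce(d_real, 1) + bce(d_fake, 0)

def segmentor_loss(seg_ce, d_fake, lam=1.0):
    """Per-pixel cross-entropy plus a term rewarding label maps that fool the discriminator."""
    return seg_ce + lam * bce(d_fake, 1)

def alternating_schedule(num_iters, period):
    """Yield which network to update at each iteration, switching every `period` steps."""
    for it in range(num_iters):
        yield "segmentor" if (it // period) % 2 == 0 else "discriminator"

# A large `period` corresponds to slow alternation between the two networks.
print(list(alternating_schedule(6, period=2)))
```

A slow schedule simply means a large `period`, so each network is trained for many consecutive iterations before the other one gets its turn.
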