First published: 2017/04/20 (7 years ago) Abstract: Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The
key idea of Softmax GAN is to replace the classification loss in the original
GAN with a softmax cross-entropy loss in the sample space of one single batch.
In the adversarial learning of $N$ real training samples and $M$ generated
samples, the target of discriminator training is to distribute all the
probability mass to the real samples, each with probability $\frac{1}{N}$, and
distribute zero probability to generated data. In the generator training phase,
the target is to assign equal probability to all data points in the batch, each
with probability $\frac{1}{M+N}$. While the original GAN is closely related to
Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance
Sampling version of GAN. We further demonstrate with experiments that this
simple change stabilizes GAN training.
_Objective:_ Replace the usual GAN loss with a softmax cross-entropy loss to stabilize GAN training.
_Dataset:_ [CelebA](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html)
## Inner working:
This work is linked to recent approaches such as WGAN or Loss-Sensitive GAN, which focus on objective functions with non-vanishing gradients to avoid the situation where the discriminator `D` becomes too good and the gradient vanishes.
Thus they first introduce two targets for the discriminator `D` and the generator `G`:
[![screen shot 2017-04-24 at 6 18 11 pm](https://cloud.githubusercontent.com/assets/17261080/25347232/767049bc-291a-11e7-906e-c19a92bb7431.png)](https://cloud.githubusercontent.com/assets/17261080/25347232/767049bc-291a-11e7-906e-c19a92bb7431.png)
[![screen shot 2017-04-24 at 6 18 24 pm](https://cloud.githubusercontent.com/assets/17261080/25347233/7670ff60-291a-11e7-974f-83eb9269d238.png)](https://cloud.githubusercontent.com/assets/17261080/25347233/7670ff60-291a-11e7-974f-83eb9269d238.png)
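From the abstract, these targets can be reconstructed as follows (a sketch of the notation, assuming $B_+$ and $B_-$ denote the real and generated parts of the batch and $B = B_+ \cup B_-$):

$$
t_D(x) = \begin{cases} \frac{1}{|B_+|} & \text{if } x \in B_+ \\ 0 & \text{if } x \in B_- \end{cases}
\qquad\qquad
t_G(x) = \frac{1}{|B|} \quad \forall x \in B
$$

i.e. the discriminator tries to put all probability mass on the real samples, while the generator tries to spread it uniformly over the whole batch.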
And then the two new losses:
[![screen shot 2017-04-24 at 6 19 50 pm](https://cloud.githubusercontent.com/assets/17261080/25347275/a303aa0a-291a-11e7-86b4-abd42c83d4a8.png)](https://cloud.githubusercontent.com/assets/17261080/25347275/a303aa0a-291a-11e7-86b4-abd42c83d4a8.png)
[![screen shot 2017-04-24 at 6 19 55 pm](https://cloud.githubusercontent.com/assets/17261080/25347276/a307bc6c-291a-11e7-98b3-cbd7182090cd.png)](https://cloud.githubusercontent.com/assets/17261080/25347276/a307bc6c-291a-11e7-98b3-cbd7182090cd.png)
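Both losses are softmax cross-entropies $L = -\sum_{x \in B} t(x) \ln p(x)$, where $p$ is a softmax over the discriminator's scores for the whole batch. A minimal PyTorch sketch of this idea (the helper name and the sign convention that higher scores mean "more real" are assumptions, not taken from the paper):

```python
import torch
import torch.nn.functional as F

def softmax_gan_losses(d_real, d_fake):
    """Softmax cross-entropy GAN losses over one batch.

    d_real: (N,) discriminator scores on real samples.
    d_fake: (M,) discriminator scores on generated samples.
    Assumes higher score = "more real" (sign convention is an
    assumption, not from the paper).
    """
    scores = torch.cat([d_real, d_fake])      # (N + M,)
    log_p = F.log_softmax(scores, dim=0)      # softmax over the whole batch
    n, m = d_real.numel(), d_fake.numel()
    dev = scores.device

    # Discriminator target: 1/N on each real sample, 0 on generated ones.
    t_d = torch.cat([torch.full((n,), 1.0 / n, device=dev),
                     torch.zeros(m, device=dev)])
    loss_d = -(t_d * log_p).sum()

    # Generator target: uniform 1/(N+M) over the whole batch.
    t_g = torch.full((n + m,), 1.0 / (n + m), device=dev)
    loss_g = -(t_g * log_p).sum()
    return loss_d, loss_g
```

In a training loop, `loss_d` would be minimized with respect to the discriminator and `loss_g` with respect to the generator (with gradients flowing through `d_fake`).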
## Architecture:
They use the DCGAN architecture, changing only the loss and removing batch normalization and the other empirical techniques commonly used to stabilize training.
They show that Softmax GAN nevertheless remains stable during training.