Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
arXiv e-Print archive - 2015 via Local arXiv
First published: 2015/11/19
Abstract: In recent years, supervised learning with convolutional networks (CNNs) has
seen huge adoption in computer vision applications. Comparatively, unsupervised
learning with CNNs has received less attention. In this work we hope to help
bridge the gap between the success of CNNs for supervised learning and
unsupervised learning. We introduce a class of CNNs called deep convolutional
generative adversarial networks (DCGANs), that have certain architectural
constraints, and demonstrate that they are a strong candidate for unsupervised
learning. Training on various image datasets, we show convincing evidence that
our deep convolutional adversarial pair learns a hierarchy of representations
from object parts to scenes in both the generator and discriminator.
Additionally, we use the learned features for novel tasks - demonstrating their
applicability as general image representations.
_Objective:_ Propose a more stable set of architectures for training GANs, and show that they learn good image representations for supervised learning and generative modeling.
_Dataset:_ [LSUN](http://www.yf.io/p/lsun) and [ImageNet 1k](http://www.image-net.org/).
The paper's architectural guidelines for building stable DCGANs are shown below.
[![screen shot 2017-04-24 at 10 58 17 am](https://cloud.githubusercontent.com/assets/17261080/25329644/f3885f7c-28dc-11e7-8895-051124c8ff6c.png)](https://cloud.githubusercontent.com/assets/17261080/25329644/f3885f7c-28dc-11e7-8895-051124c8ff6c.png)
And here is a sample network:
[![screen shot 2017-04-24 at 10 57 54 am](https://cloud.githubusercontent.com/assets/17261080/25329634/e9c14abc-28dc-11e7-8bed-068f7f7bc78d.png)](https://cloud.githubusercontent.com/assets/17261080/25329634/e9c14abc-28dc-11e7-8bed-068f7f7bc78d.png)
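To make the upsampling in the sample network concrete, the spatial size after each stride-2 fractionally-strided (transposed) convolution can be traced with the standard output-size formula. This is only a sketch, not the authors' code: the 5x5 kernel, the padding of 2, the output padding of 1 and the 4x4 starting feature map are assumptions chosen so that every layer exactly doubles the resolution, and `transposed_conv_out` and `generator_sizes` are hypothetical helper names.

```python
def transposed_conv_out(size, kernel=5, stride=2, padding=2, output_padding=1):
    """Output width/height of a transposed convolution on a square input.

    Standard formula: (size - 1) * stride - 2 * padding + kernel + output_padding.
    The default hyperparameters are assumptions, not values taken from the summary.
    """
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def generator_sizes(start=4, num_layers=4):
    """Spatial size of the feature map after each stride-2 transposed convolution."""
    sizes = [start]
    for _ in range(num_layers):
        sizes.append(transposed_conv_out(sizes[-1]))
    return sizes


if __name__ == "__main__":
    # Starting from an assumed 4x4 map, four layers each double the resolution.
    print(generator_sizes())  # [4, 8, 16, 32, 64]
```

With these settings the formula reduces to `2 * size`, which is why no pooling or explicit upsampling layer is needed: the strided convolutions themselves change the resolution.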
A TensorFlow implementation can be found [here](https://github.com/carpedm20/DCGAN-tensorflow) along with an [online demo](https://carpedm20.github.io/faces/).
The structure learned in the Z-space is particularly interesting, since it can be used for interpolation or object removal; see the widely reproduced example:
[![screen shot 2017-04-24 at 11 20 03 am](https://cloud.githubusercontent.com/assets/17261080/25330458/080b6b4e-28e0-11e7-9ab6-ce58ef5b5562.png)](https://cloud.githubusercontent.com/assets/17261080/25330458/080b6b4e-28e0-11e7-9ab6-ce58ef5b5562.png)
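Interpolation in the Z-space amounts to walking along a straight line between two latent vectors and decoding every intermediate point with the generator. A minimal sketch of that walk follows. It assumes a 100-dimensional Z with entries drawn uniformly from [-1, 1], and `sample_z` and `interpolate` are hypothetical helpers; the trained generator that would turn each vector into an image is not included.

```python
import random


def sample_z(dim=100, rng=random):
    """Draw one latent vector with entries uniform in [-1, 1] (assumed prior)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def interpolate(z_start, z_end, steps=9):
    """Return `steps` latent vectors evenly spaced on the line z_start -> z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at z_start, 1.0 at z_end
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the walk is reproducible
    z_a, z_b = sample_z(rng=rng), sample_z(rng=rng)
    path = interpolate(z_a, z_b)
    # Each vector in `path` would be passed to a trained generator G(z)
    # to produce one frame of the interpolation.
    print(len(path), len(path[0]))
```

If the learned space is smooth, the decoded frames should change gradually from one image to the other rather than jumping between unrelated samples.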
Nonetheless, the network still generates only small images (32x32 or 64x64).