Generative Adversarial Networks
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio
arXiv e-Print archive - 2014 via Local arXiv
Keywords: stat.ML, cs.LG
First published: 2014/06/10
Abstract: We propose a new framework for estimating generative models via an
adversarial process, in which we simultaneously train two models: a generative
model G that captures the data distribution, and a discriminative model D that
estimates the probability that a sample came from the training data rather than
G. The training procedure for G is to maximize the probability of D making a
mistake. This framework corresponds to a minimax two-player game. In the space
of arbitrary functions G and D, a unique solution exists, with G recovering the
training data distribution and D equal to 1/2 everywhere. In the case where G
and D are defined by multilayer perceptrons, the entire system can be trained
with backpropagation. There is no need for any Markov chains or unrolled
approximate inference networks during either training or generation of samples.
Experiments demonstrate the potential of the framework through qualitative and
quantitative evaluation of the generated samples.
GAN - derives backprop signals through a **competitive process** involving a pair of networks.
Aim: provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts, and point to remaining challenges in theory and applications.
## Introduction
- How GANs achieve this: by implicitly modelling high-dimensional distributions of data
- the generator has **no direct access to real images**; it learns only from the error signal passed back by the discriminator
- discriminator receives both the synthetic samples and samples drawn from the real images
- G: G(z) -> R^|x|, where z \in R^|z| is a sample from the latent space and x \in R^|x| is an image
- D: D(x) -> (0, 1). In practice, D may not be trained to optimality before the generator is updated
https://i.imgur.com/wOwSXhy.png
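The data flow in the notes above can be sketched in a few lines of plain Python. This is only an illustration of the two mappings, G from latent space to image space and D from image space to (0, 1); the dimensions, the single linear layer, and the names `W_g`, `b_g`, `w_d`, `c_d` are assumptions made for the example, not the architecture used in the paper.

```python
import math
import random

Z_DIM = 2   # |z|: latent dimension (illustrative value)
X_DIM = 4   # |x|: data dimension, e.g. a flattened 2x2 "image" (illustrative value)

rng = random.Random(0)

# Hypothetical parameters theta_G and theta_D, randomly initialised.
W_g = [[rng.gauss(0.0, 0.1) for _ in range(Z_DIM)] for _ in range(X_DIM)]
b_g = [0.0] * X_DIM
w_d = [rng.gauss(0.0, 0.1) for _ in range(X_DIM)]
c_d = 0.0


def generator(z):
    """G: R^|z| -> R^|x|. One linear layer stands in for a deeper network."""
    return [sum(w * zi for w, zi in zip(row, z)) + b for row, b in zip(W_g, b_g)]


def discriminator(x):
    """D: R^|x| -> (0, 1). Logistic output: estimated probability that x is real."""
    logit = sum(w * xi for w, xi in zip(w_d, x)) + c_d
    return 1.0 / (1.0 + math.exp(-logit))


z = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]   # sample from the latent space
x_fake = generator(z)                              # synthetic sample
x_real = [rng.random() for _ in range(X_DIM)]      # placeholder for a real image

# D scores both kinds of sample. G never reads x_real; the only thing it
# could learn from is D's score of x_fake.
score_fake = discriminator(x_fake)
score_real = discriminator(x_real)
```

In a real setup both functions would be multilayer perceptrons (or convolutional networks) and the scores would feed a loss that is backpropagated into each network's parameters.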
## Preliminaries
- objective functions J_G(theta_G;theta_D) and J_D(theta_D;theta_G) are **co-dependent**: each depends on the other network's parameters, and the two are updated iteratively in alternation
- difficulty: hard to construct likelihood functions for high-dimensional, real-world image data
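One way to see the co-dependence of the two objectives is through the value function of the original GAN paper, V(D, G) = E_{x~p_data}[log D(x)] + E_{z~p_z}[log(1 - D(G(z)))], which D maximises and G minimises. The sketch below evaluates it on a toy discrete distribution, where the generator is represented directly by the distribution p_g it induces. The three-outcome distributions and the function names are assumptions made for illustration, and both distributions are assumed to have full support so that no division by zero occurs.

```python
import math


def optimal_discriminator(p_data, p_g):
    """Pointwise optimal D for a fixed generator: p_data / (p_data + p_g)."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]


def value(d, p_data, p_g):
    """V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))]."""
    real_term = sum(pd * math.log(dx) for pd, dx in zip(p_data, d))
    fake_term = sum(pg * math.log(1.0 - dx) for pg, dx in zip(p_g, d))
    return real_term + fake_term


# Toy distributions over three outcomes (illustrative numbers only).
p_data = [0.5, 0.3, 0.2]
p_g_mismatched = [0.2, 0.3, 0.5]
p_g_matched = list(p_data)

# The best D depends on the current generator ...
d_mismatched = optimal_discriminator(p_data, p_g_mismatched)
d_matched = optimal_discriminator(p_data, p_g_matched)

# ... and the value G tries to minimise depends on the current D.
v_mismatched = value(d_mismatched, p_data, p_g_mismatched)
v_matched = value(d_matched, p_data, p_g_matched)
```

When p_g equals p_data the optimal discriminator outputs 1/2 everywhere and the value reaches its minimum of -log 4, matching the unique solution described in the abstract; any mismatched generator leaves the value strictly higher.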