Autoencoding beyond pixels using a learned similarity metric
Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther
arXiv e-Print archive - 2015 via Local arXiv
Keywords: cs.LG, cs.CV, stat.ML
First published: 2015/12/31
Abstract: We present an autoencoder that leverages learned representations to better
measure similarities in data space. By combining a variational autoencoder with
a generative adversarial network, we can use learned feature representations in
the GAN discriminator as the basis for the VAE reconstruction objective. Thereby,
we replace element-wise errors with feature-wise errors to better capture the
data distribution while offering invariance towards e.g. translation. We apply
our method to images of faces and show that it outperforms VAEs with
element-wise similarity measures in terms of visual fidelity. Moreover, we show
that the method learns an embedding in which high-level abstract visual
features (e.g. wearing glasses) can be modified using simple arithmetic.
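
The two ideas in the abstract can be illustrated with a toy sketch in plain Python. It is not the authors' implementation: `toy_features` is a hypothetical stand-in for a learned discriminator layer (a fixed max-pool over windows), and the mean-difference construction in `attribute_vector` is only one plausible reading of "simple arithmetic", since the abstract does not spell out the procedure. All function names and the example numbers are assumptions made for illustration.

```python
def elementwise_error(x, y):
    """Mean squared error compared position by position (pixel-wise)."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def toy_features(x, window=4):
    """Hypothetical feature extractor: max over non-overlapping windows.

    Stands in for a learned discriminator layer; small shifts inside a
    window leave the features unchanged.
    """
    return [max(x[i:i + window]) for i in range(0, len(x), window)]


def featurewise_error(x, y):
    """Mean squared error measured in feature space instead of data space."""
    return elementwise_error(toy_features(x), toy_features(y))


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def attribute_vector(with_attr, without_attr):
    """Assumed recipe: mean latent code with the attribute minus mean without."""
    return [a - b for a, b in zip(mean_vector(with_attr), mean_vector(without_attr))]


def add_attribute(z, direction, scale=1.0):
    """Move a latent code along an attribute direction."""
    return [zi + scale * di for zi, di in zip(z, direction)]


# A 1-D "image" and the same pattern shifted by one position.
signal = [0, 1, 0, 0, 0, 0, 0, 0]
shifted = [0, 0, 1, 0, 0, 0, 0, 0]
pixel_err = elementwise_error(signal, shifted)    # penalizes the shift
feature_err = featurewise_error(signal, shifted)  # ignores the shift

# Made-up 2-D latent codes for faces with and without glasses.
glasses = attribute_vector([[1.0, 2.0], [3.0, 2.0]], [[0.0, 2.0], [2.0, 2.0]])
edited = add_attribute([0.5, 0.5], glasses)
```

In this toy setting the element-wise error is nonzero for the shifted signal while the feature-wise error is zero, which mirrors the translation invariance the abstract refers to. In the method described, the features would come from the GAN discriminator and the edited latent code would be passed through the decoder to produce an image.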