Disentangling factors of variation in deep representations using adversarial training
Michael Mathieu
and
Junbo Zhao
and
Pablo Sprechmann
and
Aditya Ramesh
and
Yann LeCun
arXiv e-Print archive - 2016 via Local arXiv
Keywords:
cs.LG, stat.ML
First published: 2016/11/10 (8 years ago) Abstract: We introduce a conditional generative model for learning to disentangle the
hidden factors of variation within a set of labeled observations, and separate
them into complementary codes. One code summarizes the specified factors of
variation associated with the labels. The other summarizes the remaining
unspecified variability. During training, the only available source of
supervision comes from our ability to distinguish among different observations
belonging to the same class. Examples of such observations include images of a
set of labeled objects captured at different viewpoints, or recordings of set
of speakers dictating multiple phrases. In both instances, the intra-class
diversity is the source of the unspecified factors of variation: each object is
observed at multiple viewpoints, and each speaker dictates multiple phrases.
Learning to disentangle the specified factors from the unspecified ones becomes
easier when strong supervision is possible. Suppose that during training, we
have access to pairs of images, where each pair shows two different objects
captured from the same viewpoint. This source of alignment allows us to solve
our task using existing methods. However, labels for the unspecified factors
are usually unavailable in realistic scenarios where data acquisition is not
strictly controlled. We address the problem of disentanglement in this more
general setting by combining deep convolutional autoencoders with a form of
adversarial training. Both factors of variation are implicitly captured in the
organization of the learned embedding space, and can be used for solving
single-image analogies. Experimental results on synthetic and real datasets
show that the proposed method is capable of generalizing to unseen classes and
intra-class variabilities.