First published: 2016/06/07 (8 years ago) Abstract: An efficient learner is one who reuses what they already know to tackle a new
problem. For a machine learner, this means understanding the similarities
amongst datasets. In order to do this, one must take seriously the idea of
working with datasets, rather than datapoints, as the key objects to model.
Towards this goal, we demonstrate an extension of a variational autoencoder
that can learn a method for computing representations, or statistics, of
datasets in an unsupervised fashion. The network is trained to produce
statistics that encapsulate a generative model for each dataset. Hence the
network enables efficient learning from new datasets for both unsupervised and
supervised tasks. We show that we are able to learn statistics that can be used
for: clustering datasets, transferring generative models to new datasets,
selecting representative samples of datasets and classifying previously unseen
classes.