Stochastic Backpropagation through Mixture Density Distributions
Alex Graves
arXiv e-Print archive, 2016
Keywords: cs.NE
First published: 2016/07/19

Abstract: The ability to backpropagate stochastic gradients through continuous latent
distributions has been crucial to the emergence of variational autoencoders and
stochastic gradient variational Bayes. The key ingredient is an unbiased and
low-variance way of estimating gradients with respect to distribution
parameters from gradients evaluated at distribution samples. The
"reparameterization trick" provides a class of transforms yielding such
estimators for many continuous distributions, including the Gaussian and other
members of the location-scale family. However, the trick does not readily extend
to mixture density models, due to the difficulty of reparameterizing the
discrete distribution over mixture weights. This report describes an
alternative transform, applicable to any continuous multivariate distribution
with a differentiable density function from which samples can be drawn, and
uses it to derive an unbiased estimator for mixture density weight derivatives.
Combined with the reparameterization trick applied to the individual mixture
components, this estimator makes it straightforward to train variational
autoencoders with mixture-distributed latent variables, or to perform
stochastic variational inference with a mixture density variational posterior.
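To make the reparameterization trick concrete, here is a minimal PyTorch sketch (not taken from the paper; the variable names and the toy loss are illustrative). A Gaussian sample is rewritten as a deterministic, differentiable function of parameter-free noise, so a gradient evaluated at the sample propagates back to the distribution parameters:

```python
import torch

# Reparameterization trick for a Gaussian (a location-scale family member):
# instead of sampling z ~ N(mu, sigma^2) directly, sample eps ~ N(0, 1) and
# set z = mu + sigma * eps. The sample is then a differentiable function of
# (mu, sigma), so gradients evaluated at z flow back to the parameters.
mu = torch.tensor(0.5, requires_grad=True)
log_sigma = torch.tensor(-1.0, requires_grad=True)   # parameterize sigma > 0

eps = torch.randn(())                 # parameter-free noise
z = mu + log_sigma.exp() * eps        # differentiable transform of the noise

loss = (z - 2.0) ** 2                 # any downstream loss evaluated at the sample
loss.backward()
print(mu.grad, log_sigma.grad)        # unbiased single-sample gradient estimates
```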
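The mixture case shows the gap the abstract describes. In the sketch below (again illustrative, and not the paper's estimator), the component means are reparameterized as before, but the component index is drawn from a discrete Categorical over the mixture weights, so autograd finds no path from the loss back to the weights; the report's unbiased weight-derivative estimator is what would supply that missing gradient:

```python
import torch

# Sampling from a two-component Gaussian mixture: the component parameters
# (here the means) can still be reparameterized, but the component index is
# a discrete sample, which blocks gradients to the mixture weights.
w = torch.tensor([0.3, 0.7], requires_grad=True)        # mixture weights
mus = torch.tensor([-1.0, 2.0], requires_grad=True)     # component means

k = torch.distributions.Categorical(probs=w).sample()   # non-differentiable step
z = mus[k] + torch.randn(())                            # reparameterized, unit-variance components

loss = (z - 1.0) ** 2
loss.backward()
print(mus.grad)   # gradient reaches the chosen component's mean
print(w.grad)     # None: the discrete sample leaves no gradient path
```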