This paper extends the mixture-of-experts (MoE) model by stacking several MoE blocks to form a deep MoE. In this model, each mixture weight is produced by a gating network, and the mixture at each block is different. The whole deep MoE is trained jointly with stochastic gradient descent. The motivation of the work is to reduce decoding time by exploiting the structure imposed by the MoE model. The model is evaluated on MNIST and a speech monophone classification task.
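A minimal sketch of the architecture described above, assuming linear experts and a softmax gating network per block (layer sizes, expert counts, and initialization are illustrative, not taken from the paper):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MoEBlock:
    """One MoE block: a gating network produces input-dependent
    mixture weights over several expert transformations."""
    def __init__(self, d_in, d_out, n_experts, rng):
        # Gating network parameters (one score per expert).
        self.W_gate = rng.standard_normal((d_in, n_experts)) * 0.1
        # One linear expert per mixture component.
        self.W_experts = rng.standard_normal((n_experts, d_in, d_out)) * 0.1

    def forward(self, x):
        g = softmax(x @ self.W_gate)                     # (batch, n_experts)
        h = np.einsum('bi,eio->beo', x, self.W_experts)  # expert outputs
        return np.einsum('be,beo->bo', g, h)             # gated mixture

# Stacking blocks yields a deep MoE; each block has its own gating
# network, so the mixture differs from block to block.
rng = np.random.default_rng(0)
blocks = [MoEBlock(8, 8, 4, rng), MoEBlock(8, 8, 4, rng)]
x = rng.standard_normal((3, 8))
for b in blocks:
    x = b.forward(x)
print(x.shape)  # (3, 8)
```

In practice the gating weights can be used to prune low-weight experts at decode time, which is the source of the claimed speedup: only the experts with non-negligible gate values need to be evaluated.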