Dynamic Capacity Networks
Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, and Aaron Courville
arXiv e-Print archive, 2015
Keywords:
cs.LG, cs.NE
First published: 2015/11/24
Abstract: We introduce the Dynamic Capacity Network (DCN), a neural network that can
adaptively assign its capacity across different portions of the input data.
This is achieved by combining modules of two types: low-capacity sub-networks
and high-capacity sub-networks. The low-capacity sub-networks are applied
across most of the input, but also provide a guide to select a few portions of
the input on which to apply the high-capacity sub-networks. The selection is
made using a novel gradient-based attention mechanism that efficiently
identifies the input regions to which the DCN's output is most sensitive and to
which we should devote more capacity. We focus our empirical evaluation on the
Cluttered MNIST and SVHN image datasets. Our findings indicate that DCNs are
able to drastically reduce the number of computations, compared to traditional
convolutional neural networks, while maintaining similar or even better
performance.
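The core idea in the abstract — use the gradient of a cheap model's output with respect to the input to decide where to spend high-capacity computation — can be sketched as follows. This is a hypothetical toy illustration, not the paper's implementation: the tiny one-hidden-layer network, the 1-D "patches", and all sizes are assumptions made for the sake of a self-contained example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def saliency(x, W1, W2):
    """Gradient of the top logit of a tiny one-hidden-layer net w.r.t. x.

    Stands in for the low-capacity sub-network's sensitivity map.
    """
    h_pre = W1 @ x
    h = relu(h_pre)
    logits = W2 @ h
    top = int(np.argmax(logits))
    # Backprop by hand: d(top logit)/dx = W1^T (relu'(h_pre) * W2[top])
    return W1.T @ ((h_pre > 0).astype(float) * W2[top])

def select_patches(x, W1, W2, patch_size, k):
    """Indices of the k patches with largest mean absolute gradient.

    These are the regions a DCN-style model would route to the
    high-capacity sub-network.
    """
    g = np.abs(saliency(x, W1, W2))
    scores = g.reshape(-1, patch_size).mean(axis=1)  # one score per patch
    return np.sort(np.argsort(scores)[-k:])

# Toy input: a 1-D "image" made of 8 patches of 4 values each.
x = rng.normal(size=32)
W1 = rng.normal(size=(16, 32))
W2 = rng.normal(size=(10, 16))

chosen = select_patches(x, W1, W2, patch_size=4, k=2)
print(chosen)  # indices of the two most gradient-sensitive patches
```

In the actual DCN the saliency is computed over image patches with convolutional sub-networks, and only the selected regions are passed through the expensive high-capacity path; this sketch only shows the selection step.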