Densely Connected Convolutional Networks
Gao Huang, Zhuang Liu, and Kilian Q. Weinberger
arXiv e-Print archive, 2016
Keywords:
cs.CV, cs.LG
First published: 2016/08/25
Abstract: Recent work has shown that convolutional networks can be substantially
deeper, more accurate, and more efficient to train if they contain shorter
connections between layers close to the input and those close to the output. In
this paper, we embrace this observation and introduce the Dense Convolutional
Network (DenseNet), where each layer is directly connected to every other layer
in a feed-forward fashion. Whereas traditional convolutional networks with L
layers have L connections, one between each layer and its subsequent layer
(treating the input as layer 0), our network has L(L+1)/2 direct connections.
For each layer, the feature maps of all preceding layers are treated as
separate inputs, while its own feature maps are passed on as inputs to all
subsequent layers. Our proposed connectivity pattern has several compelling
advantages: it alleviates the vanishing gradient problem and strengthens
feature propagation; despite the increase in connections, it encourages feature
reuse and leads to a substantial reduction in the number of parameters; its models tend to
generalize surprisingly well. We evaluate our proposed architecture on five
highly competitive object recognition benchmark tasks. The DenseNet obtains
significant improvements over the state-of-the-art on all five of them (e.g.,
yielding 3.74% test error on CIFAR-10, 19.25% on CIFAR-100, and 1.59% on SVHN).
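
The connection count follows from the dense pattern: layer l receives inputs from all l earlier layers (counting the input as layer 0), so an L-layer network has 1 + 2 + ... + L = L(L+1)/2 direct connections. Below is a minimal sketch of this connectivity in PyTorch; the block structure, growth rate, and all names are illustrative assumptions rather than the authors' reference implementation, which the abstract does not specify.

# Minimal sketch of DenseNet-style connectivity (assumed PyTorch;
# layer sizes and names are illustrative, not the paper's code).
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    """BN -> ReLU -> 3x3 conv, producing `growth_rate` new feature maps."""
    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(torch.relu(self.bn(x)))

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all preceding feature maps."""
    def __init__(self, in_channels: int, num_layers: int, growth_rate: int):
        super().__init__()
        # Layer i receives in_channels + i * growth_rate input channels,
        # since every earlier layer contributed growth_rate feature maps.
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]  # the input acts as "layer 0"
        for layer in self.layers:
            # Dense connectivity: concatenate every preceding output.
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)

# Usage: a block of 4 layers over a 16-channel input with growth rate 12.
block = DenseBlock(in_channels=16, num_layers=4, growth_rate=12)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32]) -- 16 + 4 * 12 channels

Note how the channel count grows only linearly with depth even though the number of connections grows quadratically: each layer adds just growth_rate new feature maps while reusing all earlier ones, which is the mechanism behind the parameter reduction the abstract claims.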