Residual Networks of Residual Networks: Multilevel Residual Networks
Ke Zhang, Miao Sun, Tony X. Han, Xingfang Yuan, Liru Guo, Tao Liu
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CV
First published: 2016/08/09
Abstract: The residual network family, with hundreds or even thousands of layers, dominates
major image recognition tasks, but building a network by simply stacking
residual blocks inevitably limits its optimization ability. This paper proposes
a novel residual-network architecture, Residual networks of Residual networks
(RoR), to exploit the optimization ability of residual networks. Instead of
optimizing the original residual mapping, RoR optimizes a residual mapping of
residual mappings: it adds level-wise shortcut connections on top of the
original residual network to improve its learning capability.
More importantly, RoR can be applied to various kinds of residual
networks (Pre-ResNets and WRN) and significantly boost their performance. Our
experiments demonstrate the effectiveness and versatility of RoR: it
achieves the best performance across all residual-network-like structures. Our
RoR-3-WRN58-4 models achieve new state-of-the-art results on CIFAR-10,
CIFAR-100 and SVHN, with test errors of 3.77%, 19.73% and 1.59%, respectively.
These results outperform 1001-layer Pre-ResNets by 18.4% on CIFAR-10 and 13.1%
on CIFAR-100 in relative test error.
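
The level-wise shortcuts described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes three shortcut levels (as the "RoR-3" name suggests), identity shortcuts only, and a fixed feature size, so no projection is needed when adding. The names `ror_forward`, `make_toy_block` and `add` are hypothetical, and the toy block merely stands in for the convolutional layers of a real residual block.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Block = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum, standing in for the addition at a shortcut junction."""
    return [x + y for x, y in zip(a, b)]


def make_toy_block(scale: float) -> Block:
    """Hypothetical residual function F(x): a scaled ReLU.

    A real block would contain convolution, normalization and activation layers.
    """
    return lambda x: [scale * max(v, 0.0) for v in x]


def ror_forward(x: Vector, groups: Sequence[Sequence[Block]]) -> Vector:
    """Forward pass with three levels of identity shortcuts (assumed layout).

    Level 3: the ordinary shortcut around each residual block.
    Level 2: one shortcut around each group of blocks.
    Level 1: one root shortcut around all groups.
    """
    root_input = x                    # source of the level-1 shortcut
    for group in groups:
        group_input = x               # source of this group's level-2 shortcut
        for block in group:
            x = add(x, block(x))      # level 3: x + F(x)
        x = add(x, group_input)       # level 2: add the group's input back
    return add(x, root_input)         # level 1: add the network input back


# Usage: three groups of two toy blocks each.
groups = [[make_toy_block(0.1), make_toy_block(0.1)] for _ in range(3)]
output = ror_forward([1.0, -2.0], groups)
```

Dropping the level-1 and level-2 additions from `ror_forward` leaves a plain stack of residual blocks, which is the baseline the abstract contrasts RoR against.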