Systematic evaluation of CNN advances on the ImageNet
Dmytro Mishkin, Nikolay Sergievskiy and Jiri Matas
arXiv e-Print archive - 2016
Keywords:
cs.NE, cs.CV, cs.LG
First published: 2016/06/07
Abstract: The paper systematically studies the impact of a range of recent advances in
CNN architectures and learning methods on the object categorization (ILSVRC)
problem. The evaluation tests the influence of the following choices of the
architecture: non-linearity (ReLU, ELU, maxout, compatibility with batch
normalization), pooling variants (stochastic, max, average, mixed), network
width, classifier design (convolutional, fully-connected, SPP), image
pre-processing, and of learning parameters: learning rate, batch size,
cleanliness of the data, etc.
The performance gains of the proposed modifications are first tested
individually and then in combination. The sum of individual gains is bigger
than the observed improvement when all modifications are introduced, but the
"deficit" is small suggesting independence of their benefits. We show that the
use of 128x128 pixel images is sufficient to make qualitative conclusions about
optimal network structure that hold for the full-size Caffe and VGG nets. The
results are obtained an order of magnitude faster than with the standard 224x224
pixel images.
The authors test different variants of CNN architectures, non-linearities, poolings, etc. on ImageNet.
Summary:
- use the ELU non-linearity without batch normalization, or ReLU together with it (both variants sketched after this list).
- apply a learned colorspace transformation to the RGB input (two layers of 1x1 convolutions; see the sketch below).
- use the linear learning rate decay policy (sketched below).
- use a sum of the average and max pooling layers (see the module sketch below).
- use a mini-batch size of around 128 or 256. If this is too big for your GPU,
  decrease the learning rate proportionally to the batch size (see the scaling example below).
- use the fully-connected layers as convolutional ones and average the predictions
  over spatial positions for the final decision (sketched below).
- when investing in increasing the training set size, check whether a plateau has not
  been reached.
- cleanliness of the data is more important than its size.
- if you cannot increase the input image size, reduce the stride in the subsequent
  layers; it has roughly the same effect (see the example below).
- if your network has a complex and highly optimized architecture, e.g. GoogLeNet,
  be careful with modifications.
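
Below are minimal sketches of the actionable points, assuming PyTorch (the paper's own experiments used Caffe); all layer widths, kernel sizes, and hyperparameters are illustrative, not the paper's exact values. First, the two recommended non-linearity variants:

```python
import torch.nn as nn

def conv_block_elu(in_ch, out_ch):
    # ELU variant: the paper found ELU works best *without* batch normalization.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ELU(inplace=True),
    )

def conv_block_relu_bn(in_ch, out_ch):
    # ReLU variant: pair it with batch normalization.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```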
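A sketch of the learned colorspace transform: two 1x1 convolutions placed in front of the network. The 10-channel intermediate width is an assumption; check the paper for the exact configuration.

```python
import torch.nn as nn

# Learns a per-pixel transformation of the RGB input before the main network.
learned_colorspace = nn.Sequential(
    nn.Conv2d(3, 10, kernel_size=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(10, 3, kernel_size=1),
)
```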
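The linear learning rate decay policy could look like this (placeholder model and hypothetical optimizer settings; `max_iter` is your total training length):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Linearly decay the lr from its base value to zero over max_iter steps.
max_iter = 100_000
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda it: max(0.0, 1.0 - it / max_iter)
)
# call scheduler.step() once per iteration, after optimizer.step()
```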
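The max+average pooling combination as a drop-in module (kernel size and stride are illustrative defaults):

```python
import torch.nn as nn

class SumPool2d(nn.Module):
    """Element-wise sum of max pooling and average pooling over the same window."""
    def __init__(self, kernel_size=3, stride=2):
        super().__init__()
        self.max_pool = nn.MaxPool2d(kernel_size, stride)
        self.avg_pool = nn.AvgPool2d(kernel_size, stride)

    def forward(self, x):
        return self.max_pool(x) + self.avg_pool(x)
```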
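The batch-size advice amounts to a linear scaling rule; a worked example with assumed reference values:

```python
# If the reference recipe is batch 256 at lr 0.01 (values assumed here)
# and only batch 64 fits in GPU memory, scale the lr by the same factor.
ref_batch, ref_lr = 256, 0.01
actual_batch = 64
lr = ref_lr * actual_batch / ref_batch  # -> 0.0025
```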
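Converting the fully-connected classifier head to convolutions and averaging the resulting grid of predictions might look like this (channel sizes are illustrative, in the spirit of a VGG-style head):

```python
import torch
import torch.nn as nn

# Former fc layers expressed as 1x1 convolutions over the feature map.
head = nn.Sequential(
    nn.Conv2d(512, 4096, kernel_size=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(4096, 1000, kernel_size=1),  # 1000 ImageNet classes
)

x = torch.randn(1, 512, 7, 7)        # feature map from the conv trunk
logits = head(x).mean(dim=(2, 3))    # average predictions over all positions
```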
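Finally, the stride trick: halving a layer's stride preserves spatial resolution that would otherwise be lost, which acts much like feeding a larger input to the layers above (hypothetical first-layer configurations):

```python
import torch.nn as nn

# Baseline first layer with stride 4.
conv_stride4 = nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3)
# Stride 2 yields a feature map twice as large in each dimension,
# roughly equivalent to doubling the input image size.
conv_stride2 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
```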