Striving for Simplicity: The All Convolutional Net on ShortScience.org


Striving for Simplicity: The All Convolutional Net
Jost Tobias Springenberg and Alexey Dosovitskiy and Thomas Brox and Martin Riedmiller
arXiv e-Print archive - 2014 via Local arXiv
Keywords: cs.LG, cs.CV, cs.NE


Summary by Martin Thoma 7 years ago

A paper at the intersection of Computer Vision and Machine Learning. The authors simplify networks by replacing max-pooling with convolutions that use a larger stride.
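To make the replacement concrete, here is a minimal pure-Python sketch comparing 2x2 max-pooling with a stride-2 convolution on a toy single-channel feature map. The helper names (`max_pool2d`, `conv2d`), the toy input, and the 2x2 averaging kernel are assumptions for illustration only; in a real network the kernel weights are learned and the layer works on many channels.

```python
def max_pool2d(x, size=2, stride=2):
    """Max-pooling over a 2D list-of-lists feature map (no padding)."""
    h, w = len(x), len(x[0])
    return [
        [
            max(x[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, w - size + 1, stride)
        ]
        for i in range(0, h - size + 1, stride)
    ]


def conv2d(x, kernel, stride=1):
    """Single-channel 2D cross-correlation (no padding) with a given stride."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(
                x[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(0, w - kw + 1, stride)
        ]
        for i in range(0, h - kh + 1, stride)
    ]


# Toy 4x4 feature map (illustrative values only).
feature_map = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]

# Hypothetical 2x2 kernel; in a trained network these weights are learned.
kernel = [[0.25, 0.25], [0.25, 0.25]]

pooled = max_pool2d(feature_map, size=2, stride=2)   # fixed, parameter-free
strided = conv2d(feature_map, kernel, stride=2)      # learnable downsampling

print(pooled)   # [[4, 1], [2, 8]]
print(strided)  # [[2.5, 0.5], [0.75, 6.5]]
```

Both operations halve each spatial dimension (4x4 to 2x2). The difference is that the strided convolution's downsampling is parameterized by weights rather than fixed to a maximum.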

* The authors introduce a new variant of the "deconvolution approach" for visualizing features learned by CNNs, which can be applied to a broader range of network structures than existing approaches.


## Datasets

The models reach competitive or state-of-the-art performance on several object recognition datasets (CIFAR-10, CIFAR-100, ImageNet).


Summary by Abhishek Das 6 years ago

This paper simplifies the convolutional network proposed
by Alex Krizhevsky by replacing max-pooling with strided
convolutions (under the assumption that max-pooling is
required only for dimensionality reduction). The authors also
propose a novel technique for visualizing the representations
learnt by intermediate layers, which produces nicer visualizations
in input pixel space than the DeconvNet (Zeiler et al.) and saliency
map (Simonyan et al.) approaches.

## Strengths

- Their model performs on par with or better than the original AlexNet formulation.
    - Max-pooling replaced by convolution with stride 2
    - Fully-connected layers replaced by 1x1 convolutions and global averaging + softmax
    - Smaller filter size (same intuition as VGGNet paper)
- Guided backpropagation combines the DeconvNet (Zeiler et al.) and backpropagation
(Simonyan et al.) approaches at the ReLU operator, which is the only point where the two
differ: it masks out values where at least one of the input activation or the output
reconstruction is negative. This is neat and leads to nice visualizations.
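The fully-convolutional head from the list above (1x1 convolutions, then global averaging, then softmax) can be sketched in pure Python as follows. The function names, the two-channel toy input, and the weight matrix are hypothetical and chosen only to keep the arithmetic easy to follow; this is not the paper's actual architecture.

```python
import math


def conv1x1(feature_maps, weights):
    """1x1 convolution: a per-pixel linear mix of the input channels.

    feature_maps: list of C_in maps, each an H x W list of lists.
    weights: list of C_out rows, each holding C_in weights.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            [
                sum(wc * fm[i][j] for wc, fm in zip(row, feature_maps))
                for j in range(w)
            ]
            for i in range(h)
        ]
        for row in weights
    ]


def global_average_pool(feature_maps):
    """Reduce every H x W map to a single score by averaging all positions."""
    return [
        sum(sum(r) for r in fm) / (len(fm) * len(fm[0])) for fm in feature_maps
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two 2x2 input channels (illustrative values only).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

# Hypothetical weights mapping 2 channels to 3 class maps.
weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]

class_maps = conv1x1(features, weights)    # one H x W map per class
scores = global_average_pool(class_maps)   # one score per class
probs = softmax(scores)

print(scores)  # [2.5, 0.5, 1.5]
```

Because the head contains no layer tied to a fixed spatial size, the same weights can be applied to feature maps of any height and width.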

## Weaknesses / Notes

- Saliency maps generated with guided backpropagation look much better than
DeconvNet visualizations and the saliency maps from Simonyan et al.'s paper.
This is probably because negative saliency values can only arise from the very
first convolution, since negative error signals are never propagated back through the
non-linearities.
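The three backward rules at a ReLU can be compared side by side in a small sketch. The function name `relu_backward`, the mode strings, and the toy vectors are assumptions for illustration; a real implementation would apply the same masks elementwise to tensors inside a framework's backward pass.

```python
def relu_backward(forward_input, grad_from_above, mode):
    """Backward pass through a ReLU under three visualization rules.

    forward_input: values that entered the ReLU in the forward pass.
    grad_from_above: signal arriving from the layer above in the backward pass.
    """
    out = []
    for x, g in zip(forward_input, grad_from_above):
        if mode == "backprop":
            # Plain backpropagation: pass the signal where the forward input was positive.
            out.append(g if x > 0 else 0.0)
        elif mode == "deconvnet":
            # DeconvNet: pass the signal where the backward signal itself is positive.
            out.append(g if g > 0 else 0.0)
        elif mode == "guided":
            # Guided backpropagation: both conditions must hold.
            out.append(g if (x > 0 and g > 0) else 0.0)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out


# Toy values covering all four sign combinations.
forward_input = [1.0, -1.0, 2.0, -3.0]
grad_from_above = [-2.0, 3.0, 4.0, -1.0]

backprop = relu_backward(forward_input, grad_from_above, "backprop")
deconvnet = relu_backward(forward_input, grad_from_above, "deconvnet")
guided = relu_backward(forward_input, grad_from_above, "guided")

print(backprop)   # [-2.0, 0.0, 4.0, 0.0]
print(deconvnet)  # [0.0, 3.0, 4.0, 0.0]
print(guided)     # [0.0, 0.0, 4.0, 0.0]
```

Only the guided rule zeroes every entry where either the forward input or the backward signal is negative. Its output is therefore never negative after a ReLU, which matches the observation that negative values can only appear after the final pass through the first convolution.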
