SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size
Forrest N. Iandola, Matthew W. Moskewicz, Khalid Ashraf, Song Han, William J. Dally, Kurt Keutzer
arXiv e-Print archive, 2016
This paper is about reducing the number of model parameters while maintaining (most of) the model's accuracy.
The paper gives a nice overview of some key findings about CNNs. One part that is especially interesting is "2.4. Neural Network Design Space Exploration".
## Model compression
Key ideas for model compression are:
* singular value decomposition (SVD)
* Network Pruning: replace parameters that are below a certain threshold with zeros to form a sparse matrix (see the sketch after this list)
* combining Network Pruning with quantization (to 8 bits or less)
* Huffman encoding (combined with the previous point, this is Deep Compression)
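A minimal NumPy sketch of the threshold-based pruning idea (my own illustration, not code from the paper; `magnitude_prune` and the threshold value are made up for the example):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, threshold: float) -> np.ndarray:
    """Zero out all weights whose magnitude falls below `threshold`,
    turning the dense matrix into a sparse one (Network Pruning)."""
    pruned = weights.copy()
    pruned[np.abs(pruned) < threshold] = 0.0
    return pruned

# Example: prune a random 4x4 weight matrix at an arbitrary threshold
w = np.random.randn(4, 4)
w_sparse = magnitude_prune(w, threshold=0.5)
print(f"sparsity: {(w_sparse == 0).mean():.0%}")
```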
Ideas used by this paper are:
* Replacing 3x3 filters with 1x1 filters (a 1x1 filter has 9x fewer parameters)
* Decreasing the number of input channels to the 3x3 filters by using **squeeze layers**
One key idea to maintain high accuracy is to downsample late in the network. This means that layers close to the input have stride = 1, while layers with stride > 1 sit towards the end of the network, so most convolution layers work on large activation maps. A sketch of this placement follows below.
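A minimal PyTorch sketch of late downsampling (my own illustration with arbitrary channel counts, not the paper's exact layer schedule):

```python
import torch.nn as nn

# Early layers keep stride 1, so activation maps stay large;
# the stride-2 downsampling step only appears late in the stack.
net = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1),  # late downsampling
    nn.ReLU(inplace=True),
)
```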
## Fire module
A Fire module is a squeeze convolution layer (which has only $n_1$ 1x1 filters) feeding into an expand layer that has a mix of $n_2$ 1x1 and $n_3$ 3x3 convolution filters. The sizes are chosen such that
$$n_1 < n_2 + n_3,$$
so the squeeze layer acts as a bottleneck that limits the number of input channels to the 3x3 filters of the expand layer (each 3x3 filter then needs only $9 \cdot n_1$ weights).
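A minimal PyTorch sketch of a Fire module (my own code, not the authors' released implementation; the example configuration $n_1 = 16$, $n_2 = n_3 = 64$ matches fire2 from the paper):

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """Fire module: a squeeze layer of n1 1x1 filters feeding an expand
    layer with n2 1x1 and n3 3x3 filters, where n1 < n2 + n3."""
    def __init__(self, in_channels: int, n1: int, n2: int, n3: int):
        super().__init__()
        self.squeeze = nn.Conv2d(in_channels, n1, kernel_size=1)
        self.expand1x1 = nn.Conv2d(n1, n2, kernel_size=1)
        self.expand3x3 = nn.Conv2d(n1, n3, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.relu(self.squeeze(x))
        # concatenate the two expand branches along the channel axis
        return torch.cat([
            self.relu(self.expand1x1(x)),
            self.relu(self.expand3x3(x)),
        ], dim=1)

# Example: fire2 takes 96 input channels and outputs 64 + 64 = 128 channels
fire = Fire(96, n1=16, n2=64, n3=64)
out = fire(torch.randn(1, 96, 55, 55))
print(out.shape)  # torch.Size([1, 128, 55, 55])
```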
(to be continued)