Fixed Point Quantization of Deep Convolutional Networks
Darryl Dexu Lin, Sachin S. Talathi, and V. Sreekanth Annapureddy
arXiv e-Print archive, 2015
This paper proposes a layer-wise adaptive bit-width quantization of DCNs, giving a better trade-off between error rate and memory requirement than using a fixed bit-width across all layers.
The authors formulate an optimization problem that determines the bit-width of each layer of a DCN so as to reduce model size and the computation required.
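To make the layer-wise idea concrete, below is a minimal NumPy sketch of uniform fixed-point quantization with a per-layer bit-width. The helper `quantize_fixed_point` and the example bit-width assignment are illustrative assumptions, not the paper's exact procedure: in the paper, the per-layer bit-widths are the solution of the optimization problem, whereas here they are picked by hand.

```python
# Illustrative sketch only (not the authors' algorithm): symmetric uniform
# fixed-point quantization of a weight tensor to a chosen bit-width.
import numpy as np

def quantize_fixed_point(weights: np.ndarray, bit_width: int) -> np.ndarray:
    """Round weights to a symmetric fixed-point grid with `bit_width` bits."""
    # One bit encodes the sign; the remaining bits encode magnitude levels.
    n_levels = 2 ** (bit_width - 1) - 1
    scale = np.max(np.abs(weights)) / n_levels  # step size of the grid
    return np.round(weights / scale) * scale    # snap to nearest grid point

# Hypothetical per-layer assignment: more bits where the error budget demands it.
layer_bit_widths = {"conv1": 8, "conv2": 6, "fc1": 4}
rng = np.random.default_rng(0)
layers = {name: rng.standard_normal((16, 16)) for name in layer_bit_widths}
quantized = {name: quantize_fixed_point(w, layer_bit_widths[name])
             for name, w in layers.items()}
```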
This paper builds on the line of research that represents neural network weights and outputs at lower bit-depths, so that the weights take up less memory and implementations of NNs can run faster (on GPUs or more specialized hardware).