Learning both Weights and Connections for Efficient Neural Networks
Song Han, Jeff Pool, John Tran, William J. Dally
arXiv e-Print archive, 2015
Keywords:
cs.NE, cs.CV, cs.LG
First published: 2015/06/08

Abstract: Neural networks are both computationally intensive and memory intensive,
making them difficult to deploy on embedded systems. Also, conventional
networks fix the architecture before training starts; as a result, training
cannot improve the architecture. To address these limitations, we describe a
method to reduce the storage and computation required by neural networks by an
order of magnitude without affecting their accuracy by learning only the
important connections. Our method prunes redundant connections using a
three-step method. First, we train the network to learn which connections are
important. Next, we prune the unimportant connections. Finally, we retrain the
network to fine-tune the weights of the remaining connections. On the ImageNet
dataset, our method reduced the number of parameters of AlexNet by a factor of
9x, from 61 million to 6.7 million, without incurring accuracy loss. Similar
experiments with VGG-16 found that the number of parameters can be reduced by
13x, from 138 million to 10.3 million, again with no loss of accuracy.
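
The three-step procedure described in the abstract (train, prune, retrain) can be illustrated with a small sketch that prunes low-magnitude weights and then fine-tunes the survivors under a fixed sparsity mask. This is only a minimal illustration, not the paper's implementation: the toy data, layer sizes, training schedule, and the 0.05 pruning threshold below are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data and a small fully connected network (stand-ins for the
# ImageNet-scale models in the paper; sizes are illustrative).
x, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
loss_fn = nn.CrossEntropyLoss()

def train(model, steps, masks=None):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
        if masks is not None:
            # Keep pruned connections at zero while fine-tuning.
            with torch.no_grad():
                for layer, m in masks.items():
                    layer.weight *= m

# Step 1: train the dense network to learn which connections matter.
train(model, steps=200)

# Step 2: prune connections whose learned weight magnitude is small.
# The 0.05 threshold is an arbitrary placeholder; in practice it would
# be chosen per layer from the weight distribution.
masks = {}
with torch.no_grad():
    for layer in model:
        if isinstance(layer, nn.Linear):
            mask = (layer.weight.abs() > 0.05).float()
            layer.weight *= mask
            masks[layer] = mask

# Step 3: retrain (fine-tune) the remaining weights with the sparsity fixed.
train(model, steps=200, masks=masks)

remaining = sum(int(m.sum()) for m in masks.values())
total = sum(m.numel() for m in masks.values())
print(f"kept {remaining}/{total} connections")
```

Re-applying the mask after each optimizer step keeps pruned connections removed during retraining, which is the property the abstract's third step relies on; the reported 9x (AlexNet) and 13x (VGG-16) reductions come from this prune-then-retrain cycle at much larger scale.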