Designing Neural Network Architectures using Reinforcement Learning
Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.LG
First published: 2016/11/07
Abstract: At present, designing convolutional neural network (CNN) architectures
requires both human expertise and labor. New architectures are handcrafted by
careful experimentation or modified from a handful of existing networks. We
propose a meta-modelling approach based on reinforcement learning to
automatically generate high-performing CNN architectures for a given learning
task. The learning agent is trained to sequentially choose CNN layers using
Q-learning with an $\epsilon$-greedy exploration strategy and experience
replay. The agent explores a large but finite space of possible architectures
and iteratively discovers designs with improved performance on the learning
task. On image classification benchmarks, the agent-designed networks
(consisting of only standard convolution, pooling, and fully-connected layers)
beat existing networks designed with the same layer types and are competitive
against the state-of-the-art methods that use more complex layer types. We also
outperform existing network design meta-modelling approaches on image
classification.
## Ideas
* Find a CNN topology using Q-learning with $\varepsilon$-greedy exploration and experience replay
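
To make the idea concrete, here is a minimal tabular sketch of choosing layers one at a time with Q-learning, $\varepsilon$-greedy exploration and experience replay. It is not the authors' implementation: the layer vocabulary, the depth limit, the transition rules, the $\varepsilon$ schedule, the learning rate and especially `toy_reward` (a stand-in for the validation accuracy of a trained network) are all assumptions chosen to keep the example small and runnable.

```python
import random
from collections import defaultdict

LAYERS = ["conv", "pool", "fc"]  # toy layer vocabulary (assumption)
TERMINATE = "terminate"
MAX_DEPTH = 4                    # toy depth limit (assumption)


def actions(state):
    """Allowed next choices for a state (depth, last_layer)."""
    depth, last = state
    if depth >= MAX_DEPTH:
        return [TERMINATE]
    if last == "fc":
        return ["fc", TERMINATE]  # toy rule: no conv/pool after fc
    return LAYERS + [TERMINATE]


def toy_reward(arch):
    """Hypothetical stand-in for the validation accuracy of a trained network."""
    score = 0.1 * arch.count("conv") + 0.05 * arch.count("pool")
    return score + (0.2 if arch and arch[-1] == "fc" else 0.0)


def sample_architecture(q, epsilon, rng):
    """Roll out one architecture with an epsilon-greedy policy over Q-values."""
    state, arch, transitions = (0, "start"), [], []
    while True:
        acts = actions(state)
        if rng.random() < epsilon:
            action = rng.choice(acts)                        # explore
        else:
            action = max(acts, key=lambda a: q[(state, a)])  # exploit
        if action == TERMINATE:
            transitions.append((state, action, None))
            return arch, transitions
        next_state = (state[0] + 1, action)
        transitions.append((state, action, next_state))
        arch.append(action)
        state = next_state


def q_update(q, transitions, reward, alpha=0.1, gamma=1.0):
    """Apply the Q-learning update along one stored trajectory, last step first.

    The reward is only given at termination; earlier steps bootstrap from
    the best Q-value of the next state.
    """
    for state, action, next_state in reversed(transitions):
        if next_state is None:
            target = reward
        else:
            target = gamma * max(q[(next_state, a)] for a in actions(next_state))
        q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * target


def train(episodes_per_epsilon=50, replay_updates=20, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)
    replay = []  # experience replay buffer of (trajectory, reward) pairs
    for epsilon in [1.0, 0.7, 0.4, 0.1]:  # assumed exploration schedule
        for _ in range(episodes_per_epsilon):
            arch, transitions = sample_architecture(q, epsilon, rng)
            replay.append((transitions, toy_reward(arch)))
            batch = rng.sample(replay, min(replay_updates, len(replay)))
            for old_transitions, old_reward in batch:
                q_update(q, old_transitions, old_reward)
    best_arch, _ = sample_architecture(q, 0.0, rng)  # purely greedy rollout
    return best_arch, q


if __name__ == "__main__":
    best_arch, _ = train()
    print(best_arch, toy_reward(best_arch))
```

In a real setting the reward would come from training each sampled network and measuring its validation accuracy, which is the expensive step. The replay buffer lets each such measurement be reused for many Q-value updates.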
## Evaluation
The authors do not appear to be aware of DenseNets.
* CIFAR-10: 6.92 % error ([SOTA](https://martin-thoma.com/sota/#image-classification) is 3.46 % - not mentioned in the paper)
* SVHN: 2.06 % error ([SOTA](https://martin-thoma.com/sota/#image-classification) is 1.59 % - not mentioned in the paper)
* MNIST: 0.31 % error ([SOTA](https://martin-thoma.com/sota/#image-classification) is 0.21 % - not mentioned in the paper)
* CIFAR-100: 27.14 % error ([SOTA](https://martin-thoma.com/sota/#image-classification) is 17.18 % - not mentioned in the paper)
## Related Work
* Google: [Neural Architecture Search with Reinforcement Learning](https://arxiv.org/abs/1611.01578) ([summary](http://www.shortscience.org/paper?bibtexKey=journals/corr/1611.01578#martinthoma))