Designing Neural Network Architectures using Reinforcement Learning
Bowen Baker
and
Otkrist Gupta
and
Nikhil Naik
and
Ramesh Raskar
arXiv e-Print archive - 2016 via Local arXiv
Keywords:
cs.LG
First published: 2016/11/07 (7 years ago) Abstract: At present, designing convolutional neural network (CNN) architectures
requires both human expertise and labor. New architectures are handcrafted by
careful experimentation or modified from a handful of existing networks. We
propose a meta-modelling approach based on reinforcement learning to
automatically generate high-performing CNN architectures for a given learning
task. The learning agent is trained to sequentially choose CNN layers using
Q-learning with an $\epsilon$-greedy exploration strategy and experience
replay. The agent explores a large but finite space of possible architectures
and iteratively discovers designs with improved performance on the learning
task. On image classification benchmarks, the agent-designed networks
(consisting of only standard convolution, pooling, and fully-connected layers)
beat existing networks designed with the same layer types and are competitive
against the state-of-the-art methods that use more complex layer types. We also
outperform existing network design meta-modelling approaches on image
classification.