Deep convolutional neural networks (DCNNs) have been a popular model for image classification over the last few years. This paper proposes a DCNN architecture, now known as AlexNet, for the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC). To train AlexNet, which has 60 million parameters, the authors use Rectified Linear Units (ReLUs) and multiple GPUs to accelerate training. They also report that local response normalization and overlapping pooling reduce the error rate. To prevent overfitting, they use data augmentation and apply dropout in the first two fully connected layers.
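The four building blocks named above can be illustrated with a minimal pure-Python sketch. The function names are hypothetical. The default constants (k = 2, n = 5, alpha = 1e-4, beta = 0.75 for normalization, a 3-wide pooling window with stride 2, and a dropout probability of 0.5) are the values usually quoted from the original paper and are assumptions here, since this summary does not state them. The pooling is shown in one dimension for brevity.

```python
import random

def relu(x):
    """Rectified Linear Unit: pass positive inputs through, clamp the rest to 0."""
    return max(0.0, x)

def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize each channel by the squared activations of its n neighbouring channels.

    `a` holds the activations of all channels at a single spatial position.
    The constants are assumed defaults, not values taken from this summary.
    """
    num_channels = len(a)
    out = []
    for i in range(num_channels):
        lo = max(0, i - n // 2)
        hi = min(num_channels - 1, i + n // 2)
        sq_sum = sum(a[j] ** 2 for j in range(lo, hi + 1))
        out.append(a[i] / (k + alpha * sq_sum) ** beta)
    return out

def max_pool_1d(x, size=3, stride=2):
    """Max pooling; windows overlap whenever stride < size."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, stride)]

def dropout(x, p=0.5, training=True, rng=random):
    """Training: zero each unit with probability p. Inference: scale outputs by (1 - p)."""
    if training:
        return [0.0 if rng.random() < p else v for v in x]
    return [v * (1.0 - p) for v in x]
```

With size 3 and stride 2, neighbouring pooling windows share one element, which is what makes the pooling "overlapping"; setting stride equal to size would give the conventional non-overlapping variant.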
The following figure shows the architecture of AlexNet. It contains five convolutional and three fully connected layers. Response-normalization layers follow the first and second convolutional layers. Max-pooling layers follow the first and second response-normalization layers and the fifth convolutional layer.
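The layer ordering described above can be checked with a rough shape and parameter trace. This is a sketch under stated assumptions, not the authors' code: the kernel sizes, strides, paddings, channel counts, and two-way channel grouping are the values commonly reported for AlexNet rather than anything given in this summary, and the 227x227 input is the size usually used to make the arithmetic work (the paper itself states 224x224). Response-normalization layers are omitted because they do not change tensor shapes or add parameters.

```python
def out_size(n, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * pad - kernel) // stride + 1

def conv_params(c_in, c_out, kernel, groups=1):
    """Weights plus biases of a (possibly grouped) convolution."""
    return kernel * kernel * (c_in // groups) * c_out + c_out

# (name, kind, out_channels, kernel, stride, pad, groups): assumed hyperparameters.
LAYERS = [
    ("conv1", "conv", 96, 11, 4, 0, 1),
    ("pool1", "pool", None, 3, 2, 0, 1),
    ("conv2", "conv", 256, 5, 1, 2, 2),
    ("pool2", "pool", None, 3, 2, 0, 1),
    ("conv3", "conv", 384, 3, 1, 1, 1),
    ("conv4", "conv", 384, 3, 1, 1, 2),
    ("conv5", "conv", 256, 3, 1, 1, 2),
    ("pool5", "pool", None, 3, 2, 0, 1),
]
FC = [("fc6", 4096), ("fc7", 4096), ("fc8", 1000)]

def trace(input_size=227, in_channels=3):
    """Return per-layer (name, channels, spatial size) rows and the parameter total."""
    size, ch, total = input_size, in_channels, 0
    rows = []
    for name, kind, c_out, k, s, p, g in LAYERS:
        if kind == "conv":
            total += conv_params(ch, c_out, k, g)
            ch = c_out
        size = out_size(size, k, s, p)
        rows.append((name, ch, size))
    features = ch * size * size  # flattened input to the first fully connected layer
    for name, units in FC:
        total += features * units + units
        features = units
        rows.append((name, units, 1))
    return rows, total

if __name__ == "__main__":
    rows, total = trace()
    for name, channels, size in rows:
        print(f"{name}: {channels} x {size} x {size}")
    print(f"total parameters: {total:,}")
```

Under these assumptions the trace ends at a 6x6x256 feature map before the fully connected layers and a total of roughly 61 million parameters, in line with the 60 million figure quoted above. Most of that total sits in the fully connected layers, which is one reason dropout is applied there.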