Summary by Two Minute Papers 6 years ago
This technique is a combination of two powerful machine learning algorithms:
- convolutional neural networks are excellent at image classification, i.e., finding out what is seen on an input image,
- recurrent neural networks that are capable of processing a sequence of inputs and outputs, therefore it can create sentences of what is seen on the image.
Combining these two techniques makes it possible for a computer to describe in a sentence what is seen on an input image.