First published: 2016/12/25 (7 years ago) Abstract: We introduce YOLO9000, a state-of-the-art, real-time object detection system
that can detect over 9000 object categories. First we propose various
improvements to the YOLO detection method, both novel and drawn from prior
work. The improved model, YOLOv2, is state-of-the-art on standard detection
tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At
40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like
Faster RCNN with ResNet and SSD while still running significantly faster.
Finally we propose a method to jointly train on object detection and
classification. Using this method we train YOLO9000 simultaneously on the COCO
detection dataset and the ImageNet classification dataset. Our joint training
allows YOLO9000 to predict detections for object classes that don't have
labelled detection data. We validate our approach on the ImageNet detection
task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite
only having detection data for 44 of the 200 classes. On the 156 classes not in
COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes;
it predicts detections for more than 9000 different object categories. And it
still runs in real-time.