Training Region-based Object Detectors with Online Hard Example Mining
Abhinav Shrivastava, Abhinav Gupta, and Ross Girshick
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CV, cs.LG
First published: 2016/04/12
Abstract: The field of object detection has made significant advances riding on the
wave of region-based ConvNets, but their training procedure still includes many
heuristics and hyperparameters that are costly to tune. We present a simple yet
surprisingly effective online hard example mining (OHEM) algorithm for training
region-based ConvNet detectors. Our motivation is the same as it has always
been -- detection datasets contain an overwhelming number of easy examples and
a small number of hard examples. Automatic selection of these hard examples can
make training more effective and efficient. OHEM is a simple and intuitive
algorithm that eliminates several heuristics and hyperparameters in common use.
But more importantly, it yields consistent and significant boosts in detection
performance on benchmarks like PASCAL VOC 2007 and 2012. Its effectiveness
increases as datasets become larger and more difficult, as demonstrated by the
results on the MS COCO dataset. Moreover, combined with complementary advances
in the field, OHEM leads to state-of-the-art results of 78.9% and 76.3% mAP on
PASCAL VOC 2007 and 2012 respectively.
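
The selection step the abstract alludes to can be illustrated with a small sketch. It assumes, for illustration only, that each candidate region already has a scalar loss under the current model and that "hard" means "highest loss"; the abstract itself does not spell out the selection rule. The function name `select_hard_examples` and the sample loss values are hypothetical and not taken from the paper's code.

```python
def select_hard_examples(losses, batch_size):
    """Return indices of the `batch_size` highest-loss examples, hardest first.

    Assumption: `losses` holds one scalar loss per candidate region,
    computed by a forward pass of the current model.
    """
    if batch_size <= 0:
        return []
    # Rank candidate indices by loss, largest first (ties keep input order).
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:batch_size]


# Hypothetical per-region losses: mostly easy (low loss), a few hard ones.
roi_losses = [0.02, 1.35, 0.10, 0.87, 0.01, 2.40]
hard = select_hard_examples(roi_losses, batch_size=3)
print(hard)  # [5, 1, 3]
```

In a training loop, only the selected indices would contribute to the backward pass, so the many easy regions stop dominating each update. This is a sketch of the general idea, not the authors' implementation.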