Training Region-based Object Detectors with Online Hard Example Mining
Abhinav Shrivastava, Abhinav Gupta, and Ross Girshick
arXiv e-Print archive - 2016
Keywords:
cs.CV, cs.LG
First published: 2016/04/12
Abstract: The field of object detection has made significant advances riding on the
wave of region-based ConvNets, but their training procedure still includes many
heuristics and hyperparameters that are costly to tune. We present a simple yet
surprisingly effective online hard example mining (OHEM) algorithm for training
region-based ConvNet detectors. Our motivation is the same as it has always
been -- detection datasets contain an overwhelming number of easy examples and
a small number of hard examples. Automatic selection of these hard examples can
make training more effective and efficient. OHEM is a simple and intuitive
algorithm that eliminates several heuristics and hyperparameters in common use.
But more importantly, it yields consistent and significant boosts in detection
performance on benchmarks like PASCAL VOC 2007 and 2012. Its effectiveness
increases as datasets become larger and more difficult, as demonstrated by the
results on the MS COCO dataset. Moreover, combined with complementary advances
in the field, OHEM leads to state-of-the-art results of 78.9% and 76.3% mAP on
PASCAL VOC 2007 and 2012 respectively.
The problem this paper addresses is that detection training sets exhibit a large imbalance between the number of foreground examples and background examples. To make the point concrete: for sliding-window object detectors like the deformable parts model (DPM), the imbalance may be as extreme as 100,000 background examples to one annotated foreground example.
Before proceeding to the details of hard example mining (HEM), I just want to note that HEM in essence means that, while training, you sort your losses and train the model on the most difficult examples, which mostly means the ones with the highest loss (an extension of this idea can be found in the Focal Loss paper). This is a simple but powerful technique.
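Here is a minimal NumPy sketch of that core idea, selecting the top-k losses and averaging only those (the function name `topk_hard_example_loss` and the toy numbers are my own illustration, not from the paper):

```python
import numpy as np

def topk_hard_example_loss(losses, k):
    """Average only the k highest per-example losses; easy examples
    contribute nothing. This is the essence of hard example mining."""
    hard_idx = np.argsort(losses)[::-1][:k]  # indices of the k largest losses
    return losses[hard_idx].mean(), hard_idx

# Toy usage: most examples are easy (low loss), a few are hard.
losses = np.array([0.01, 0.02, 2.5, 0.03, 1.8, 0.02])
mean_hard_loss, hard_idx = topk_hard_example_loss(losses, k=2)
print(hard_idx, mean_hard_loss)  # -> [2 4] 2.15
```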
Taking this as our background, the authors propose a simple but effective method to train Fast R-CNN. Their approach is as follows:
1. For an input image at SGD iteration t, they first compute a convolutional feature map using the conv network.
2. The RoI network uses this feature map and all the input RoIs to do a forward pass.
3. Hard examples are selected by sorting the RoIs by loss and taking the B/N examples on which the current network performs worst (here B is the batch size and N is the number of images per batch, so B/N hard RoIs are selected per image).
4. While doing this, the authors notice that co-located RoIs with high overlap are likely to have correlated losses. Also, overlapping RoIs project onto mostly the same region of the conv feature map, because the feature map is a coarser/smaller representation of the input image. This could lead to loss double counting. To deal with it, they use standard non-maximum suppression (NMS).
5. NMS works here by iteratively selecting the RoI with the highest loss and removing all lower-loss RoIs that have high overlap with the selected region, using an IoU threshold of 0.7 (see the sketch after this list).
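The following is a self-contained NumPy sketch of this hard-RoI selection (steps 3-5): NMS performed on losses rather than detection scores. The function names (`ohem_select`, `iou`), the box format, and the toy values are my own illustrative choices under these assumptions; the paper's actual implementation runs this selection on the GPU inside the RoI network.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_box + areas - inter)

def ohem_select(rois, losses, num_keep, iou_thresh=0.7):
    """NMS by loss: repeatedly keep the highest-loss RoI and suppress
    lower-loss RoIs overlapping it by more than iou_thresh, until
    num_keep hard examples are selected (num_keep plays the role of B/N)."""
    order = np.argsort(losses)[::-1]  # RoI indices, highest loss first
    keep = []
    while order.size > 0 and len(keep) < num_keep:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(rois[i], rois[rest])
        order = rest[overlaps < iou_thresh]  # drop near-duplicate RoIs
    return np.array(keep)

# Toy usage: RoI 1 nearly duplicates RoI 0 (IoU = 0.81), so it is suppressed
# to avoid counting essentially the same loss twice.
rois = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 60, 60]], dtype=float)
losses = np.array([2.0, 1.9, 0.5])
print(ohem_select(rois, losses, num_keep=2))  # -> [0 2]
```

Only the RoIs returned by this selection are fed through the backward pass, so the gradient is dominated by the examples the network currently gets most wrong.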