YOLO9000: Better, Faster, Stronger on ShortScience.org

YOLO9000: Better, Faster, Stronger
Joseph Redmon and Ali Farhadi
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CV

Summary by Léo Paillier 7 years ago

_Objective:_ Train on both classification and detection images to build a better, faster, and stronger detector.

_Dataset:_ [ImageNet](http://www.image-net.org/), [COCO](http://mscoco.org/) and [WordNet](https://wordnet.princeton.edu/).


## Architecture:

Apart from incremental improvements such as batch normalization and other general tweaks, the real gains come from:

1.  Using both a classification dataset and a detection dataset at the same time.
2.  Replacing the usual final softmax layer (which assumes that all labels are mutually exclusive) with a WordTree label hierarchy based on WordNet, which lets the network predict `dog` even when it cannot tell whether the dog is a `Fox Terrier`.
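
To make the WordTree idea in item 2 concrete, here is a minimal Python sketch, not the authors' code. It assumes a tiny hand-written hierarchy (`PARENT`), made-up logits (`LOGITS`), and a 0.5 threshold, all of which are hypothetical. A softmax is taken over each group of siblings to get the conditional probability of a node given its parent, and the prediction walks down from the root along the most likely child until the accumulated probability would drop below the threshold.

```python
import math

# Hypothetical toy hierarchy (child -> parent); the real WordTree is built from WordNet.
PARENT = {
    "animal": "root",
    "vehicle": "root",
    "dog": "animal",
    "cat": "animal",
    "fox terrier": "dog",
    "husky": "dog",
}

# Made-up network outputs, one raw score per node (illustrative values only).
LOGITS = {
    "animal": 3.0, "vehicle": 0.0,
    "dog": 2.0, "cat": 0.0,
    "fox terrier": 0.1, "husky": 0.0,
}


def children_of(node):
    """Return the direct children of a node in the toy hierarchy."""
    return [child for child, parent in PARENT.items() if parent == node]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def conditional_probs(logits):
    """One softmax per sibling group, giving P(node | parent) for every node."""
    cond = {}
    for parent in set(PARENT.values()):
        siblings = children_of(parent)
        probs = softmax([logits[s] for s in siblings])
        cond.update(zip(siblings, probs))
    return cond


def absolute_prob(node, cond):
    """Multiply conditional probabilities along the path from node up to the root."""
    prob = 1.0
    while node != "root":
        prob *= cond[node]
        node = PARENT[node]
    return prob


def predict(cond, threshold=0.5):
    """Descend from the root, stopping before confidence falls below threshold."""
    node, prob = "root", 1.0
    while True:
        kids = children_of(node)
        if not kids:
            return node, prob  # reached a leaf
        best = max(kids, key=lambda k: cond[k])
        if prob * cond[best] < threshold:
            return node, prob  # not confident enough to go more specific
        node, prob = best, prob * cond[best]


label, confidence = predict(conditional_probs(LOGITS))
print(label)  # -> dog
```

With these made-up scores the network is confident about `animal` and `dog` but nearly undecided between the two breeds, so the walk stops at `dog` instead of forcing a `fox terrier` or `husky` label. If even the first split falls below the threshold, this sketch returns `root`, meaning no confident label.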

[![screen shot 2017-04-12 at 7 24 28 pm](https://cloud.githubusercontent.com/assets/17261080/24970727/b7abaf02-1fb5-11e7-8b78-2a430a861cbd.png)](https://cloud.githubusercontent.com/assets/17261080/24970727/b7abaf02-1fb5-11e7-8b78-2a430a861cbd.png)

## Results:

State-of-the-art results at full resolution, with the option to trade some accuracy for faster computation.

[![screen shot 2017-04-12 at 7 31 26 pm](https://cloud.githubusercontent.com/assets/17261080/24971010/a51556f8-1fb6-11e7-9289-fc277b182686.png)](https://cloud.githubusercontent.com/assets/17261080/24971010/a51556f8-1fb6-11e7-9289-fc277b182686.png)
