_Objective:_ Predict labels using a very large dataset with noisy labels and a much smaller (3 orders of magnitude) dataset with human-verified annotations.
_Dataset:_ [Open Images](https://research.googleblog.com/2016/09/introducing-open-images-dataset.html)
Unlike other approaches, they use not only the clean and noisy labels but also image features. They train three networks:
1. A feature extractor for the image.
2. A label cleaning network that learns to predict the verified labels from the noisy labels and the image features.
3. An image classifier that predicts using just the image.
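
To make the data flow between the three networks concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the layer sizes, the random single-layer "networks", the sigmoid outputs (assuming a multi-label setup) and all function names are assumptions chosen for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical toy sizes, not taken from the paper.
NUM_PIXELS, FEAT_DIM, NUM_CLASSES = 8, 4, 3

def make_linear(n_in, n_out):
    """Random weight matrix; stands in for a trained network."""
    return [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def linear(weights, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def sigmoid(z):
    """Element-wise sigmoid, giving one independent score per class."""
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

W_feat = make_linear(NUM_PIXELS, FEAT_DIM)                  # 1. feature extractor
W_clean = make_linear(NUM_CLASSES + FEAT_DIM, NUM_CLASSES)  # 2. label cleaning network
W_cls = make_linear(FEAT_DIM, NUM_CLASSES)                  # 3. image classifier

def extract_features(image):
    """Map the image to a feature vector (single ReLU layer here)."""
    return [max(0.0, v) for v in linear(W_feat, image)]

def clean_labels(noisy_labels, features):
    """Predict cleaned labels from the noisy labels concatenated with image features."""
    return sigmoid(linear(W_clean, noisy_labels + features))

def classify(features):
    """Predict labels from the image features alone; noisy labels are not an input."""
    return sigmoid(linear(W_cls, features))

image = [random.random() for _ in range(NUM_PIXELS)]
noisy = [1.0, 0.0, 1.0]  # noisy multi-hot annotation for this image

features = extract_features(image)
cleaned = clean_labels(noisy, features)
predicted = classify(features)
print("cleaned labels:  ", [round(p, 3) for p in cleaned])
print("classifier preds:", [round(p, 3) for p in predicted])
```

The point of the sketch is the input signature of each part: the cleaning network sees both the noisy labels and the features, while the classifier only sees the features, so it can be used at test time when no annotations exist.
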
[![screen shot 2017-04-12 at 11 10 56 am](https://cloud.githubusercontent.com/assets/17261080/24950258/c4764106-1f70-11e7-82e4-c1111ffc089e.png)](https://cloud.githubusercontent.com/assets/17261080/24950258/c4764106-1f70-11e7-82e4-c1111ffc089e.png)
Overall performance is better, but the improvement is not breathtaking: from `AP 83.832 / MAP 61.82` for an NN trained only on labels to `AP 87.67 / MAP 62.38` with their approach.