StreetStyle: Exploring world-wide clothing styles from millions of photos
Kevin Matzen, Kavita Bala, and Noah Snavely
arXiv e-Print archive - 2017 via Local Bibsonomy
_Objective:_ Analyze a large-scale dataset of fashion images to discover visually consistent style clusters.
* _Dataset:_ StreetStyle-27K.
* _Code:_ demo [here](http://streetstyle.cs.cornell.edu/)
## New dataset: StreetStyle-27K
1. **Photos (100 million)**: collected from Instagram, using the [API](https://www.instagram.com/developer/) to retrieve images tagged with location and time.
2. **People (14.5 million)**: they run two algorithms to normalize the body position in the image:
   * [Face++](http://www.faceplusplus.com/) to detect and localize faces.
   * [Deformable Part Model](http://people.cs.uchicago.edu/%7Erbg/latent-release5/) to estimate the visibility of the rest of the body.
3. **Clothing annotations (27K)**: collected through Amazon Mechanical Turk with quality control, at a cost of $4000 for the whole dataset.
## Architecture:
A standard GoogLeNet, but they use [Isotonic Regression](http://fastml.com/classifier-calibration-with-platts-scaling-and-isotonic-regression/) to correct the bias in the classifier's predictions.
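
As a rough illustration of the calibration step, the sketch below fits an isotonic (non-decreasing) mapping from raw classifier scores to probabilities with the pool-adjacent-violators algorithm. The function names `fit_isotonic` and `calibrate`, the toy scores and labels, and the step-function prediction rule are assumptions made for illustration; this is not the authors' code.

```python
from bisect import bisect_left


def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from scores to labels (PAV)."""
    # Pool identical scores so each distinct score is one weighted point.
    pooled = {}
    for s, y in zip(scores, labels):
        total, count = pooled.get(s, (0.0, 0))
        pooled[s] = (total + y, count + 1)

    # Each block stores [sum of labels, number of points, largest score].
    blocks = []
    for s in sorted(pooled):
        total, count = pooled[s]
        blocks.append([total, count, s])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return thresholds, values


def calibrate(score, thresholds, values):
    """Map a raw score to the mean label of its block, clipping at both ends."""
    idx = bisect_left(thresholds, score)
    return values[min(idx, len(values) - 1)]


# Toy held-out data (hypothetical): raw scores and binary ground-truth labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1, 1]
thresholds, values = fit_isotonic(scores, labels)
print(calibrate(0.45, thresholds, values))
```

Library implementations such as scikit-learn's `IsotonicRegression` interpolate between fitted points rather than returning a step function, so their outputs differ slightly from this sketch between training scores.
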
## Unsupervised clustering:
They proceed as follows:
1. Compute the feature embeddings for a subset of the overall dataset, selected to represent locations and times.
2. Apply L2 normalization.
3. Use PCA to keep the components that explain 90% of the variance (165 dimensions here).
4. Cluster them using a [GMM](https://en.wikipedia.org/wiki/Mixture_model#Multivariate_Gaussian_mixture_model) with 400 mixture components, each of which represents a cluster.
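
The steps above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `cluster_embeddings`, the synthetic features standing in for CNN embeddings, the default covariance type, and the small number of mixture components in the demo are all chosen for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import normalize


def cluster_embeddings(features, variance=0.90, n_clusters=400, seed=0):
    """L2-normalize, reduce with PCA, then cluster with a Gaussian mixture."""
    # Step 2: scale every embedding to unit L2 norm.
    unit = normalize(features, norm="l2")
    # Step 3: keep as many principal components as needed to explain
    # the requested fraction of the variance.
    pca = PCA(n_components=variance, svd_solver="full")
    reduced = pca.fit_transform(unit)
    # Step 4: fit the mixture; each component plays the role of a cluster.
    # The covariance type is left at the library default (an assumption).
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed)
    labels = gmm.fit_predict(reduced)
    return labels, pca, gmm


# Synthetic stand-in for step 1: three groups of 32-dimensional "embeddings".
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 32)) * 5
features = np.vstack(
    [c + rng.normal(scale=0.3, size=(100, 32)) for c in centers]
)

labels, pca, gmm = cluster_embeddings(features, n_clusters=3)
print("PCA dimensions kept:", pca.n_components_)
print("Cluster sizes:", np.bincount(labels))
```

With real embeddings, `n_clusters=400` would match the summary above, and `pca.n_components_` reports how many dimensions the 90% threshold retains.
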
They compute fashion clusters for cities or larger regions:
[![screen shot 2017-06-15 at 12 04 06 pm](https://user-images.githubusercontent.com/17261080/27176447-d33fc2dc-51c2-11e7-9191-dbf972ee96a1.png)](https://user-images.githubusercontent.com/17261080/27176447-d33fc2dc-51c2-11e7-9191-dbf972ee96a1.png)
## Results:
The techniques are fairly standard, but they are combined to produce interesting visualizations.