Big Neural Networks Waste Capacity on ShortScience.org

Big Neural Networks Waste Capacity
Dauphin, Yann and Bengio, Yoshua
arXiv e-Print archive - 2013 via Local Bibsonomy
Keywords: dblp

Summary by Martin Thoma 7 years ago

The paper 'Big Neural Networks Waste Capacity' observes that adding more layers or parameters does not necessarily improve accuracy. When reading this paper, one should bear in mind that it was written well before [Deep Residual Learning for Image Recognition](http://www.shortscience.org/paper?bibtexKey=journals/corr/HeZRS15) or DenseNets.

In the experiments, they applied MLPs to SIFT features of ImageNet LSVRC-2010.
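To make "more parameters" concrete, here is a minimal sketch of how the parameter count of a fully connected MLP grows with the hidden-layer width. The function name `mlp_param_count` and all layer sizes are assumptions chosen for illustration; they are not taken from the paper.

```python
def mlp_param_count(n_in, hidden, n_out):
    """Count weights and biases of a fully connected MLP.

    n_in:   input feature dimension
    hidden: list of hidden-layer widths
    n_out:  number of output classes
    """
    sizes = [n_in, *hidden, n_out]
    # Each layer has (fan_in * fan_out) weights plus fan_out biases.
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))


# Hypothetical sizes for illustration only, not the paper's setup.
N_FEATURES, N_CLASSES = 1000, 1000
for width in (1000, 2000, 4000):
    print(width, mlp_param_count(N_FEATURES, [width], N_CLASSES))
```

With a single hidden layer, capacity grows roughly linearly in the width, so doubling the width about doubles the number of parameters. Whether that extra capacity actually reduces training error is the question the paper examines.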

**Do not read this paper**. Instead, you might want to read "Deep Residual Learning for Image Recognition". It makes the same point more clearly and offers a solution to the underfitting problem.


## Criticism

I don't understand why they write about k-means.

> Assuming minimal error in the human labelling of the dataset, it should be possible to reach errors close to 0%.

For ImageNet, the human labeling error is estimated at about 5% (I can't find the source for that, though).


> Improvements on ImageNet are thought to be a good proxy for progress in object recognition (Deng et al., 2009).

ImageNet images are very different from "typical web images" like the [100 million images Flickr dataset](http://yahoolabs.tumblr.com/post/89783581601/one-hundred-million-creative-commons-flickr-images-for).
