On Correlation of Features Extracted by Deep Neural Networks on ShortScience.org


On Correlation of Features Extracted by Deep Neural Networks
Babajide O. Ayinde and Tamer Inanc and Jacek M. Zurada
2019 International Joint Conference on Neural Networks (IJCNN) - 2019 via Local CrossRef

Summary by David Stutz 5 years ago

Ayinde et al. study how network architecture and weight initialization affect the learning of redundant features. To empirically estimate the number of redundant features, the authors cluster features agglomeratively based on their cosine similarity: given a set of features, clusters are merged as long as their (average) cosine similarity stays within some threshold $\tau$. The resulting number of redundant features is then compared across network architectures. Figure 1, for example, shows the number of redundant features on MNIST for networks of different depths and with different activation functions. As can be seen, ReLU activations avoid redundant features, while greater network depth usually encourages them.

https://i.imgur.com/ICcCL2u.jpg
Figure 1: Number of redundant features $n_r$ for networks with $n' = 1000$ hidden units, computed using the given threshold $\tau$. Experiments with different depths and activation functions are shown.
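
The clustering step described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each feature is a plain weight vector, that two clusters are merged while their average pairwise cosine similarity is at least $\tau$, and that the number of redundant features is $n_r = n' - n_f$, where $n_f$ is the number of clusters that remain. The function names `cosine_similarity` and `count_redundant_features` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; defined as 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def count_redundant_features(features, tau):
    """Estimate n_r by average-linkage agglomerative clustering.

    Assumption (not taken from the paper): clusters are merged greedily,
    most similar pair first, while their average pairwise cosine
    similarity is at least `tau`; n_r = n' - (number of clusters left).
    """
    n = len(features)
    # Precompute all pairwise similarities between individual features.
    sim = [[cosine_similarity(u, v) for v in features] for u in features]
    # Start with every feature in its own cluster (lists of feature indices).
    clusters = [[i] for i in range(n)]

    def average_similarity(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        best, best_pair = -2.0, None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                s = average_similarity(clusters[p], clusters[q])
                if s > best:
                    best, best_pair = s, (p, q)
        if best < tau:
            break  # no remaining pair is similar enough to merge
        p, q = best_pair
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    return n - len(clusters)


# Toy example: two pairs of nearly parallel features plus one distinct feature.
toy_features = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.05, 1.0], [-1.0, 0.0]]
n_r = count_redundant_features(toy_features, tau=0.9)
```

In this toy example the two nearly parallel pairs are merged and the fifth feature stays on its own, leaving three clusters and $n_r = 2$. The greedy search is cubic or worse in the number of features, so for $n' = 1000$ hidden units a library implementation of hierarchical clustering would be the more practical choice.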

Also find this summary at [davidstutz.de](https://davidstutz.de/category/reading/).
