Deep Linear Discriminant Analysis
Matthias Dorfer, Rainer Kelz, Gerhard Widmer
arXiv e-Print archive, 2015
Keywords: cs.LG
First published: 2015/11/15
Abstract: We introduce Deep Linear Discriminant Analysis (DeepLDA), which learns linearly separable latent representations in an end-to-end fashion. Classic LDA extracts features which preserve class separability and is used for dimensionality reduction in many classification problems. The central idea of this paper is to put LDA on top of a deep neural network. This can be seen as a non-linear extension of classic LDA. Instead of maximizing the likelihood of target labels for individual samples, we propose an objective function that pushes the network to produce feature distributions which (a) have low variance within the same class and (b) have high variance between different classes. Our objective is derived from the general LDA eigenvalue problem and still allows training with stochastic gradient descent and back-propagation. For evaluation we test our approach on three different benchmark datasets (MNIST, CIFAR-10 and STL-10). DeepLDA produces competitive results on MNIST and CIFAR-10 and outperforms a network trained with categorical cross-entropy (same architecture) in a supervised setting of STL-10.
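To make the eigenvalue-based objective concrete, here is a minimal PyTorch sketch in the spirit of the paper. Everything in it is my illustration: the name `deep_lda_loss`, the `eps` regularizer value, and the scatter-matrix normalizations are assumptions, and the paper's actual eigenvalue selection uses a margin-based threshold rather than the fixed `k` smallest used here; see the reference implementation for the exact Theano objective.

```python
import torch

def deep_lda_loss(features, labels, n_classes, eps=1e-3, k=None):
    """Eigenvalue-based LDA loss on network features (simplified sketch)."""
    N, d = features.shape
    overall_mean = features.mean(dim=0)

    Sw = torch.zeros(d, d, dtype=features.dtype)
    Sb = torch.zeros(d, d, dtype=features.dtype)
    for c in range(n_classes):
        Xc = features[labels == c]
        mc = Xc.mean(dim=0)
        Xc_centered = Xc - mc
        # within-class covariance of class c (one common LDA convention)
        Sw = Sw + Xc_centered.T @ Xc_centered / (Xc.shape[0] - 1)
        diff = (mc - overall_mean).unsqueeze(1)
        Sb = Sb + Xc.shape[0] * (diff @ diff.T)
    Sw = Sw / n_classes
    Sb = Sb / N

    # regularize Sw for numerical stability, as the paper does (Sw + lambda*I)
    Sw = Sw + eps * torch.eye(d, dtype=features.dtype)

    # generalized eigenproblem Sb v = lambda Sw v via Cholesky whitening:
    # with Sw = L L^T, the eigenvalues of L^{-1} Sb L^{-T} equal the
    # generalized eigenvalues, and the matrix stays symmetric.
    L = torch.linalg.cholesky(Sw)
    Linv = torch.linalg.inv(L)
    evals = torch.linalg.eigvalsh(Linv @ Sb @ Linv.T)  # ascending order

    # only C-1 eigenvalues are informative (rank of Sb); the paper maximizes
    # the ones below a margin threshold, simplified here to the k smallest
    informative = evals[-(n_classes - 1):]
    k = k if k is not None else n_classes - 1
    return -informative[:k].mean()
```

The Cholesky whitening step turns the generalized eigenvalue problem into an ordinary symmetric one, which is what lets any framework with a differentiable symmetric eigendecomposition back-propagate through the loss.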
There are two implementations of the paper:
1. [Reference Implementation of Deep Linear Discriminant Analysis (DeepLDA)](https://github.com/CPJKU/deep_lda).
2. [VahidooX/DeepLDA](https://github.com/VahidooX/DeepLDA).
   [Something seems to be wrong with the implemented cost function](https://github.com/VahidooX/DeepLDA/issues/1#issuecomment-392261355).
Also, while the authors derive the gradient in the paper, they never verify it; the implementation relies on Theano's autograd (while other autograd frameworks can't work it out). A numerical gradient check, as sketched below, would address this.
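Since the derived gradient was never verified, a finite-difference check is a cheap sanity test. This sketch assumes the hypothetical `deep_lda_loss` from the earlier snippet and uses PyTorch's `torch.autograd.gradcheck`; double precision is required for its tolerances.

```python
import torch

torch.manual_seed(0)
feats = torch.randn(60, 8, dtype=torch.float64, requires_grad=True)
labels = torch.randint(0, 3, (60,))

# compares autograd's gradient of the loss against finite differences
ok = torch.autograd.gradcheck(
    lambda f: deep_lda_loss(f, labels, n_classes=3),
    (feats,),
    eps=1e-6,
    atol=1e-4,
)
print("autograd gradient matches finite differences:", ok)
```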