Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results.
Antti Tarvainen
and
Harri Valpola
Neural Information Processing Systems Conference - 2017 via Local dblp
Keywords:
* Semi-supervised method
* There is a teacher net and a student net, with identical architecture.
* The teacher makes predictions on unlabeled data, which are used as ground-truth for training the student net.
* After each gradient descent update on the student, the teacher's weights are updated so that it becomes an exponential moving average of the weights of the student at previous timesteps. It's called a "mean teacher" because of this moving average.