SoftTarget Regularization: An Effective Technique to Reduce Over-Fitting in Neural Networks
Armen Aghajanyan
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.LG


Summary by felipe 7 years ago
Can you expand on "target labels are augmented with a exponential mean of output labels"? Why is the moving average effective? What impact does it have on training?

Thanks for your comment. I have added detail on how the training labels are changed. However, the paper does not explicitly show why the moving average is effective. I also could not find its effect on the training process itself, only on the training result (lower test loss).

Thanks! That is really clear now. So I think SoftTarget makes it harder for a single weight update to change the output drastically, so only persistent changes to the output are adopted. Because the output changes slowly, the loss and then the gradient will also shrink and grow slowly. This feels like a kind of momentum that takes the individual outputs that enter the loss into account.
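
To make the discussion above concrete, here is a minimal sketch of how target labels could be augmented with an exponential moving average of the network's outputs. It is an illustration under stated assumptions, not the paper's implementation: the function names `ema_update` and `mixed_targets`, the decay `beta`, the mixing weight `gamma`, and the toy predictions are all hypothetical choices made for this example.

```python
import numpy as np

def ema_update(soft, preds, beta):
    """Exponential moving average of the outputs seen so far.
    `beta` is an assumed decay rate: higher means slower change."""
    return beta * soft + (1.0 - beta) * preds

def mixed_targets(hard, soft, gamma):
    """Blend the original labels with the averaged outputs.
    `gamma` is an assumed mixing weight for the soft part."""
    return gamma * soft + (1.0 - gamma) * hard

# One sample, two classes; the true class is index 1.
hard = np.array([[0.0, 1.0]])
soft = hard.copy()  # assumption: start the average at the hard labels

# Toy outputs: one outlier prediction followed by two stable ones.
toy_preds = [
    np.array([[0.9, 0.1]]),
    np.array([[0.1, 0.9]]),
    np.array([[0.1, 0.9]]),
]

history = []
for preds in toy_preds:
    soft = ema_update(soft, preds, beta=0.7)
    history.append(mixed_targets(hard, soft, gamma=0.5))

for step, target in enumerate(history):
    print(step, target)
```

With these assumed values, the outlier prediction in the first step moves the target only part of the way towards the wrong class, which matches the intuition that only persistent changes in the output end up in the training signal.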
