Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Ioffe, Sergey and Szegedy, Christian
International Conference on Machine Learning, 2015

Summary by Denny Britz 7 years ago
Could you please explain why adding the parameters $\beta$ and $\gamma$ does not change the variance?

What do you mean by "shuffle training examples more thoroughly"?

