Stop Wasting My Gradients: Practical SVRG
Harikandeh, Reza and Ahmed, Mohamed Osama and Virani, Alim and Schmidt, Mark and Konecný, Jakub and Sallinen, Scott
Neural Information Processing Systems Conference - 2015 via Local Bibsonomy
Keywords: dblp
This paper extends the recently proposed stochastic optimization algorithm SVRG. The modifications mainly include: a convergence analysis of SVRG with an inexact (corrupted) full gradient; mixing SGD and SVRG iterations; a mini-batching strategy; and exploiting support vectors. For each modification, the authors give clear proofs and establish linear convergence under smoothness and strong-convexity assumptions. However, the novelty is limited: the improvement in the convergence rate is modest, the proof outline closely follows that of the original SVRG, and key problems such as support for non-strongly convex losses remain unsolved.
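For context, here is a minimal sketch of the standard SVRG update that these modifications build on. The names (`grad_i`, `step_size`, the inner-loop length `m`) are illustrative assumptions, not the authors' code.

```python
import numpy as np

def svrg(grad_i, w0, n, step_size, n_outer=20, m=None, rng=None):
    """Minimal SVRG sketch for f(w) = (1/n) * sum_i f_i(w).

    grad_i(w, i) returns the gradient of the i-th component f_i at w.
    """
    rng = rng or np.random.default_rng(0)
    m = m or n                       # inner-loop length, often a small multiple of n
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(n_outer):
        w_snap = w.copy()
        # Full gradient at the snapshot -- the quantity the paper shows can be inexact.
        mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced stochastic gradient.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= step_size * g
    return w
```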
The paper starts with a key proposition showing that SVRG does not require a highly accurate approximation of the full gradient that the original algorithm computes at each snapshot. The authors use this proposition to derive a batching SVRG algorithm with the same convergence rate as the original SVRG. They then propose a mixed SGD/SVRG approach and prove its convergence. As a complementary speed-up, they propose a technique that exploits support vectors for the Huberized hinge-loss support vector machine.
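A minimal sketch of the batching idea, assuming the snapshot gradient is estimated on a growing subsample rather than on all n examples. The geometric batch-growth schedule (`batch0`, `growth`) is an assumption for illustration, not the paper's exact schedule.

```python
import numpy as np

def batching_svrg(grad_i, w0, n, step_size, n_outer=20, m=None,
                  batch0=16, growth=2.0, rng=None):
    """Sketch of SVRG with an inexact snapshot gradient estimated on a
    growing subsample, in the spirit of the paper's batching variant."""
    rng = rng or np.random.default_rng(0)
    m = m or n
    w = np.asarray(w0, dtype=float).copy()
    batch = batch0
    for _ in range(n_outer):
        w_snap = w.copy()
        # Estimate the snapshot gradient on a subsample instead of all n examples;
        # the paper's proposition says this inexactness need not hurt the rate.
        B = rng.choice(n, size=min(int(batch), n), replace=False)
        mu = np.mean([grad_i(w_snap, i) for i in B], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= step_size * g
        batch *= growth              # grow the batch so the estimation error shrinks
    return w
```

The point of the sketch is that only the snapshot gradient `mu` changes; the variance-reduced inner update is identical to standard SVRG, which is why the convergence argument carries over.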