They observe a pattern they call The Filter Lottery (TFL), in which the random seed used for initialization causes high variance in the training accuracy:
![](http://i.imgur.com/5rWig0H.png)
They use the convolutional gradient norm ($CGN$) \cite{conf/fgr/LoC015} to determine how much impact a filter has on the overall classification loss by taking the derivative of the loss function with respect to each weight in the filter:
$$CGN(k) = \sum_{i} \left|\frac{\partial L}{\partial w^k_i}\right|$$
They use the CGN to evaluate a filter's impact on the error, and re-initialize a filter when the gradient norm of its weights falls below a specific threshold.
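A minimal NumPy sketch of this idea, assuming gradients of shape `(num_filters, in_channels, kh, kw)`; the function names, the threshold, and the Gaussian re-initialization scale are illustrative choices, not taken from the paper:

```python
import numpy as np

def cgn(grad):
    # CGN(k) = sum_i |dL/dw_i^k|: sum of absolute loss gradients
    # over all weights of each filter k.
    return np.abs(grad).reshape(grad.shape[0], -1).sum(axis=1)

def reinit_low_cgn_filters(weights, grad, threshold, rng):
    # Re-initialize filters whose CGN falls below `threshold`.
    # (Hypothetical re-init: fresh draws from a small Gaussian.)
    norms = cgn(grad)
    dead = norms < threshold
    weights = weights.copy()
    weights[dead] = rng.normal(0.0, 0.01, size=weights[dead].shape)
    return weights, dead

# Example: 3 filters; only filter 0 receives any gradient signal.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.01, size=(3, 1, 2, 2))
grad = np.zeros((3, 1, 2, 2))
grad[0] = 1.0

new_weights, dead = reinit_low_cgn_filters(weights, grad, 0.5, rng)
# dead is [False, True, True]: filters 1 and 2 are re-initialized.
```

In practice the gradient would come from backpropagation on the conv layer's weights; the check could run periodically during training rather than every step.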