The authors build their work on top of the generalized conditional gradient (GCG) method for sparse optimization. In particular, GCG methods require computing the polar operator of the sparse regularization function (e.g., the dual norm when the regularizer is an atomic norm). In this work, the authors identify a class of regularization functions induced by an underlying subset cost function. The key idea is to 'lift' the regularizer into a higher-dimensional space with additional constraints; there, a property the authors call 'marginalized modularity' allows the polar operator to be reformulated as a linear program. Finally, the approach is extended to general proximal objectives. The experiments demonstrate that the method achieves better objective values in substantially less CPU time than a competing polar operator method and accelerated proximal gradient (APG) on group Lasso and path coding problems.
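To make the role of the polar operator in GCG concrete, the following is a minimal sketch for the simplest case touched by the experiments: non-overlapping group lasso, where the polar operator has a closed form (the maximum group-wise l2 norm of the gradient). The names `polar_group_lasso` and `gcg`, the open-loop step size, and the constrained formulation are all illustrative assumptions; the authors' contribution is precisely the lifted LP reformulation needed when no such closed form exists (e.g., overlapping groups or path coding), and their method re-optimizes atom weights rather than taking a fixed step.

```python
import numpy as np

def polar_group_lasso(g, groups):
    """Polar operator of the non-overlapping group-lasso norm at g.

    Returns max_G ||g_G||_2 and the maximizing unit atom: g restricted
    to the best group, normalized. (Closed form only in this easy case.)
    """
    best_val, best_atom = -1.0, np.zeros_like(g)
    for G in groups:
        val = np.linalg.norm(g[G])
        if val > best_val:
            atom = np.zeros_like(g)
            if val > 0:
                atom[G] = g[G] / val
            best_val, best_atom = val, atom
    return best_val, best_atom

def gcg(grad_f, dim, groups, rho=1.0, iters=200):
    """Conditional-gradient sketch for min f(x) s.t. Omega(x) <= rho,
    with Omega the group-lasso norm. Each iteration calls the polar
    operator on the negative gradient to select the next atom."""
    x = np.zeros(dim)
    for t in range(iters):
        _, atom = polar_group_lasso(-grad_f(x), groups)
        eta = 2.0 / (t + 2)                # standard open-loop step size
        x = (1 - eta) * x + eta * rho * atom
    return x

# Toy usage (hypothetical data): least squares with two groups of three.
A = np.random.randn(20, 6); b = np.random.randn(20)
groups = [np.arange(0, 3), np.arange(3, 6)]
x = gcg(lambda x: A.T @ (A @ x - b), dim=6, groups=groups, rho=2.0)
```

The point of the sketch is that the polar operator is the only structure-dependent step of GCG, which is why reducing it to a linear program for a broad class of subset-cost-based regularizers is the crux of the paper.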