In a GP regression model, the process outputs can be integrated over analytically, but this is not so for (a) inputs and (b) kernel hyperparameters. Titsias etal 2010 showed a very clever way to do (a) with a particular variational technique (the goal was to do density estimation). In this paper, (b) is tackled, which requires some nontrivial extensions of Titsias etal. In particular, they show how to decouple the GP prior from the kernel hyperparameters. This is a simple trick, but very effective for what they want to do. They also treat the large number of kernel hyperparameters with an additional level of ARD and show how the ARD hyperparameters can be solved for analytically, which is nice.