The authors present a robust low-rank kernel embedding built on higher-order tensors and latent variable models. Overall the work is interesting and promising, and it builds bridges between machine learning, kernel methods, tensor decompositions, and latent variable models.
The RKHS embedding of a joint probability distribution over two variables is given by the covariance operator; for joint distributions over multiple variables, a tensor operator is needed. The paper defines these objects together with appropriate inner products, norms, and reshaping operations on them. It then observes that when the conditional dependence structure induced by latent variables is a tree, these operators are low-rank when reshaped along the edges connecting latent variables. A low-rank decomposition of the embedding is then proposed that can be computed purely from Gram matrices. Empirical results on density estimation tasks are impressive.
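To make the central point concrete for readers less familiar with kernel embeddings, the following is a minimal sketch (not the authors' implementation) of the two ingredients the paper combines: an empirical cross-covariance operator whose coordinate representation needs only Gram matrices, and a low-rank truncation of that representation. The data-generating setup, kernel choice, and rank are my own illustrative assumptions.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) for an RBF kernel."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
n = 200
# A single latent variable h induces dependence between observed X and Y,
# the simplest instance of a tree-structured latent model.
h = rng.normal(size=(n, 1))
X = h + 0.1 * rng.normal(size=(n, 1))
Y = h + 0.1 * rng.normal(size=(n, 1))

Kx = rbf_gram(X)
Ky = rbf_gram(Y)

# The empirical cross-covariance operator is C_XY = (1/n) sum_i phi(x_i) (x) phi(y_i).
# Projected onto the span of the samples, its coordinate representation is an
# n x n matrix computed from centered Gram matrices alone.
H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
C = (H @ Kx @ H) @ (H @ Ky @ H) / n          # coordinate representation of C_XY

# Low-rank structure: because the dependence flows through the latent h,
# a small-rank truncation should already capture most of the operator.
r = 5                                         # illustrative rank
U, s, Vt = np.linalg.svd(C)
C_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
rel_err = np.linalg.norm(C - C_r) / np.linalg.norm(C)
```

The point of the sketch is the last block: every quantity is built from `Kx` and `Ky`, so no explicit feature maps are needed, which is exactly what lets the proposed decomposition be implemented on Gram matrices.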