A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
Yoshua Bengio
and
Tristan Deleu
and
Nasim Rahaman
and
Rosemary Ke
and
Sébastien Lachapelle
and
Olexa Bilaniuk
and
Anirudh Goyal
and
Christopher Pal
arXiv e-Print archive - 2019 via Local arXiv
Keywords:
cs.LG, stat.ML
First published: 2019/01/30 (5 years ago) Abstract: We propose to meta-learn causal structures based on how fast a learner adapts
to new distributions arising from sparse distributional changes, e.g. due to
interventions, actions of agents and other sources of non-stationarities. We
show that under this assumption, the correct causal structural choices lead to
faster adaptation to modified distributions because the changes are
concentrated in one or just a few mechanisms when the learned knowledge is
modularized appropriately. This leads to sparse expected gradients and a lower
effective number of degrees of freedom needing to be relearned while adapting
to the change. It motivates using the speed of adaptation to a modified
distribution as a meta-learning objective. We demonstrate how this can be used
to determine the cause-effect relationship between two observed variables. The
distributional changes do not need to correspond to standard interventions
(clamping a variable), and the learner has no direct knowledge of these
interventions. We show that causal structures can be parameterized via
continuous variables and learned end-to-end. We then explore how these ideas
could be used to also learn an encoder that would map low-level observed
variables to unobserved causal variables leading to faster adaptation
out-of-distribution, learning a representation space where one can satisfy the
assumptions of independent mechanisms and of small and sparse changes in these
mechanisms due to actions and non-stationarities.