Learning to Ground Multi-Agent Communication with Autoencoders on ShortScience.org

Learning to Ground Multi-Agent Communication with Autoencoders
Toru Lin and Minyoung Huh and Chris Stauffer and Ser-Nam Lim and Phillip Isola
arXiv e-Print archive - 2021 via Local arXiv
Keywords: cs.LG, cs.AI, cs.CL, cs.MA

Summaries/Notes 1

Summary by CodyWild 3 years ago

In certain classes of multi-agent cooperation games, it's useful for agents to be able to coordinate on future actions, which makes a communication channel between the agents an obvious tool. However, prior work in multi-agent RL has shown that it's surprisingly hard to train agents that (1) consistently learn to use a communication channel in a way that is informative rather than random, and (2) if they do communicate, converge on a common grounding for the meaning of symbols, so that they can use them effectively.

This paper suggests a straightforward and clever approach: instead of having agents communicate using arbitrary vectors produced as part of a policy, tie those communication vectors directly to the content of an agent's observations. Specifically, the encoding of the image that is used for making policy decisions is passed through an autoencoder, and the bottleneck at the middle of the autoencoder is used as the communication vector sent to other agents. This structure incentivizes the agent to generate communication vectors that are intrinsically grounded in the observation, enforcing a level of consistency that the authors hope makes it easier for the other agent to follow and interpret the communication.

https://i.imgur.com/u9OAZm8.png
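
To make the structure concrete, here is a minimal sketch of the idea in plain Python. It is not the authors' implementation: the linear encoder/decoder, the continuous message, the dimensions, and every name (`CommAutoencoder`, `listener_policy_input`, `FEAT_DIM`, `MSG_DIM`) are assumptions made for illustration, and random vectors stand in for the image encoding. The point is only that the message is the autoencoder bottleneck and is trained by reconstruction loss alone, with no reward signal involved.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def init_matrix(rows, cols, rng, scale=0.3):
    """Small random weights in [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class CommAutoencoder:
    """Hypothetical linear autoencoder whose bottleneck doubles as the message."""

    def __init__(self, feat_dim, msg_dim, seed=0):
        rng = random.Random(seed)
        self.W_enc = init_matrix(msg_dim, feat_dim, rng)  # features -> message
        self.W_dec = init_matrix(feat_dim, msg_dim, rng)  # message -> features

    def message(self, features):
        """The bottleneck activation is what gets sent to other agents."""
        return matvec(self.W_enc, features)

    def loss(self, features):
        """Mean squared reconstruction error of the features."""
        recon = matvec(self.W_dec, self.message(features))
        return sum((r - f) ** 2 for r, f in zip(recon, features)) / len(features)

    def train_step(self, features, lr=0.05):
        """One SGD step on the reconstruction loss only (no RL gradient)."""
        m = self.message(features)
        recon = matvec(self.W_dec, m)
        n = len(features)
        # Gradient of the loss with respect to the reconstruction.
        err = [2.0 * (r - f) / n for r, f in zip(recon, features)]
        # Gradient with respect to the message, using W_dec before it is updated.
        grad_m = [sum(self.W_dec[i][j] * err[i] for i in range(n))
                  for j in range(len(m))]
        for i in range(n):
            for j in range(len(m)):
                self.W_dec[i][j] -= lr * err[i] * m[j]
        for j in range(len(m)):
            for k in range(n):
                self.W_enc[j][k] -= lr * grad_m[j] * features[k]


def mean_loss(autoencoder, dataset):
    return sum(autoencoder.loss(x) for x in dataset) / len(dataset)


def listener_policy_input(own_features, received_message):
    """The listener's policy sees its own features plus the received message."""
    return list(own_features) + list(received_message)


FEAT_DIM, MSG_DIM = 8, 4
rng = random.Random(1)
# Stand-in for image encodings: random feature vectors (an assumption).
dataset = [[rng.uniform(-1, 1) for _ in range(FEAT_DIM)] for _ in range(64)]

speaker = CommAutoencoder(FEAT_DIM, MSG_DIM)
initial_loss = mean_loss(speaker, dataset)
for step in range(2000):
    speaker.train_step(dataset[step % len(dataset)])
final_loss = mean_loss(speaker, dataset)

msg = speaker.message(dataset[0])
listener_input = listener_policy_input(dataset[1], msg)
print(f"reconstruction loss: {initial_loss:.3f} -> {final_loss:.3f}")
print(f"message length: {len(msg)}, listener input length: {len(listener_input)}")
```

Because the message is a deterministic function of the speaker's observation features, the same observation always yields the same message, which is the kind of consistency the summary describes. How the real system encodes images, shapes or discretizes messages, and feeds them to the policy is not covered by this sketch.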

Empirically, there seems to be fairly compelling evidence that this autoencoder-based form of grounding is more stable, and thus more mutually learnable, than learning communication from RL alone. The authors even found that adding RL training on top of the autoencoder-based training degraded performance.
