They build a system for distinguishing metaphorical word pairs (“blind alley”, “honest meal”, etc.) from literal ones.
https://i.imgur.com/Bv1gIb2.png
Annotated metaphor examples from Tsvetkov et al. (2014), used in this work.
The basic system uses word embedding similarity: the cosine between the embeddings of the two words in the pair. They then explore a variation using phrase embeddings, cos(phrase - word2, word2), which is similar to the vector-offset operations on word regularities by Mikolov et al.
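The two scores above can be sketched in a few lines. This is only an illustration, not the authors' code: the function names and the toy vectors are made up, and the decision rule (low similarity means metaphorical, with a tuned threshold) is an assumption that the summary does not spell out.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_u * norm_v)

def word_pair_score(word1_vec, word2_vec):
    """Basic system: cosine between the two word embeddings."""
    return cosine(word1_vec, word2_vec)

def phrase_offset_score(phrase_vec, word2_vec):
    """Phrase variant: cos(phrase - word2, word2)."""
    offset = [p - w for p, w in zip(phrase_vec, word2_vec)]
    return cosine(offset, word2_vec)

def is_metaphor(score, threshold):
    """Assumed decision rule: a low similarity score suggests a metaphor."""
    return score < threshold

# Toy 3-dimensional vectors, purely illustrative.
blind = [0.9, 0.1, 0.0]
alley = [0.0, 0.2, 0.9]
blind_alley = [0.4, 0.5, 0.6]

print(word_pair_score(blind, alley))
print(phrase_offset_score(blind_alley, alley))
```

In practice the vectors would come from trained word and phrase embeddings, and the threshold would be tuned on held-out annotated pairs.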
Finally, they create vector representations for words and phrases using visual information. The words are used as queries in Google Image Search, and the returned images are passed through an image detection network in order to obtain vector representations. The best final system performs the task separately using linguistic and visual vectors, and then combines the resulting scores.
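The score combination in the final system could look something like the sketch below. The summary does not say how the two scores are combined, so the weighted average, the `weight` and `threshold` parameters, and the function names are all assumptions chosen for illustration.

```python
def combine_scores(linguistic_score, visual_score, weight=0.5):
    """Weighted average of the two modality scores (assumed fusion rule)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * linguistic_score + (1.0 - weight) * visual_score

def classify(linguistic_score, visual_score, threshold=0.5, weight=0.5):
    """Label a word pair from the fused score.

    Assumes, as a simplification, that a low fused similarity
    indicates a metaphorical pair.
    """
    combined = combine_scores(linguistic_score, visual_score, weight)
    return "metaphorical" if combined < threshold else "literal"

# Hypothetical similarity scores for one word pair.
print(classify(linguistic_score=0.2, visual_score=0.6))
```

Keeping the two modalities separate until the final score means each one can be tuned or swapped out without retraining the other.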