Counter-fitting Word Vectors to Linguistic Constraints
Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Lina Rojas-Barahona, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, Steve Young
arXiv e-Print archive - 2016
Keywords: cs.CL, cs.LG
First published: 2016/03/02
Abstract: In this work, we present a novel counter-fitting method which injects antonymy and synonymy constraints into vector space representations in order to improve the vectors' capability for judging semantic similarity. Applying this method to publicly available pre-trained word vectors leads to a new state-of-the-art performance on the SimLex-999 dataset. We also show how the method can be used to tailor the word vector space for the downstream task of dialogue state tracking, resulting in robust improvements across different dialogue domains.
They describe a method for augmenting existing word embeddings with knowledge of semantic constraints. The idea is similar to retrofitting by Faruqui et al. (2015), but it uses additional constraints (antonymy as well as synonymy) and a different optimisation function.
https://i.imgur.com/zedR5FV.png
Existing word vectors are further optimised to 1) have high similarity for known synonyms, 2) have low similarity for known antonyms, and 3) have high similarity to words that were highly similar in the original space. They evaluate on SimLex-999, showing state-of-the-art performance. They also use the method to improve a dialogue state tracking system.
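To make the three objective terms concrete, here is a minimal sketch of a counter-fitting-style update in NumPy. This is not the authors' implementation: the paper optimises hinge losses over cosine distances with its own margins and weights, and its third term preserves pairwise distances between words that were close in the original space. The sketch below simplifies to Euclidean distances, plain SGD, an L2 pull back towards each word's original vector, and made-up hyperparameters (`syn_margin`, `ant_margin`, `reg`, `lr`, `epochs` are all illustrative).

```python
import numpy as np

def counter_fit(vectors, synonyms, antonyms,
                syn_margin=0.0, ant_margin=1.0,
                reg=0.1, lr=0.05, epochs=20):
    """Simplified counter-fitting sketch.

    vectors:  dict mapping word -> np.ndarray embedding (modified in place)
    synonyms: iterable of (word, word) pairs that should end up close
    antonyms: iterable of (word, word) pairs that should end up far apart
    """
    original = {w: v.copy() for w, v in vectors.items()}

    for _ in range(epochs):
        # 1) Synonym Attract: pull known synonyms together while their
        #    distance exceeds the synonym margin.
        for u, w in synonyms:
            diff = vectors[u] - vectors[w]
            dist = np.linalg.norm(diff) + 1e-8
            if dist > syn_margin:               # hinge term is active
                grad = diff / dist
                vectors[u] -= lr * grad
                vectors[w] += lr * grad

        # 2) Antonym Repel: push known antonyms apart until they reach
        #    the antonym margin.
        for u, w in antonyms:
            diff = vectors[u] - vectors[w]
            dist = np.linalg.norm(diff) + 1e-8
            if dist < ant_margin:               # hinge term is active
                grad = diff / dist
                vectors[u] += lr * grad
                vectors[w] -= lr * grad

        # 3) Vector Space Preservation, crudely approximated here as an
        #    L2 pull of every word back towards its original embedding.
        for w in vectors:
            vectors[w] -= lr * reg * (vectors[w] - original[w])

    return vectors

# Toy usage with hypothetical words and 2-d vectors:
vecs = {"east": np.array([0.9, 0.1]), "west": np.array([0.85, 0.15]),
        "expensive": np.array([0.2, 0.8]), "pricey": np.array([0.6, 0.4])}
vecs = counter_fit(vecs, synonyms={("expensive", "pricey")},
                   antonyms={("east", "west")})
```

The key design point the sketch tries to capture is that all three terms are applied jointly, so antonyms like "east"/"west" get pushed apart even though they start out very close (as distributional vectors typically are), while the preservation term stops the rest of the space from drifting too far from the original embeddings.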