Recursive Neural Networks Can Learn Logical Semantics
Samuel R. Bowman
and
Christopher Potts
and
Christopher D. Manning
arXiv e-Print archive - 2014 via Local arXiv
Keywords:
cs.CL, cs.LG, cs.NE
First published: 2014/06/06 (10 years ago) Abstract: Tree-structured recursive neural networks (TreeRNNs) for sentence meaning
have been successful for many applications, but it remains an open question
whether the fixed-length representations that they learn can support tasks as
demanding as logical deduction. We pursue this question by evaluating whether
two such models---plain TreeRNNs and tree-structured neural tensor networks
(TreeRNTNs)---can correctly learn to identify logical relationships such as
entailment and contradiction using these representations. In our first set of
experiments, we generate artificial data from a logical grammar and use it to
evaluate the models' ability to learn to handle basic relational reasoning,
recursive structures, and quantification. We then evaluate the models on the
more natural SICK challenge data. Both models perform competitively on the SICK
data and generalize well in all three experiments on simulated data, suggesting
that they can learn suitable representations for logical inference in natural
language.
The paper wished to see whether distributed word representations could be used in a machine learning setting to achieve good performance in the task of natural language inference (also called recognizing textual entailment).
## Summary
Paper investigates the use of two neural architectures for identifying the entailment and contradiction logical relationships. In particular, the paper uses tree-based recursive neural networks and tree-based recursive neural tensor networks relying on the notion of compositionality to codify natural language word order and semantic meaning. The models are first tested on a reduced artificial data set organized around a small boolean structure world model where the models are tasked with learning the propositional relationships. Afterwards these models are tested on another more complex artificial data set where simple propositions are converted into more complex formulas. Both models achieved solid performance on these datasets although in the latter data set the RNTN seemed to struggle when tested on larger-length expressions. The models were also tested on an artificial dataset where they were tasked with learning how to correctly interpret various quantifiers and negation in the context of natural logics. Finally, the models were tested on a freely available textual entailment dataset called SICK (that was supplemented with data from the Denotation Graph project). The models achieved reasonably good performance on the SICK challenge, showing that they have the potential to accurately learn distributed representations on noisy real-world data.
## Future Work
The neural models proposed seem to show particular promise for achieving good performance on very natural language logical semantics tasks. It is firmly believed that given enough data, the neural architectures proposed have the potential to perform even better on the proposed task. This makes acquisition of a more comprehensive and diverse dataset a natural next step in pursuing this modeling approach. Further even the more powerful RNTN seems to show rapidly declining performance on larger expressions which leaves the question of whether stronger models or learning techniques can be used to improve performance on considerably-sized expressions. In addition, there is still the question as to how these architectures actually encode the natural logics they are being asked to learn.