A Joint Model of Language and Perception for Grounded Attribute Learning
Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, Dieter Fox
arXiv e-Print archive, 2012
Keywords: cs.CL, cs.LG, cs.RO
First published: 2012/06/27
Abstract: As robots become more ubiquitous and capable, it becomes ever more important
to enable untrained users to easily interact with them. Recently, this has led
to study of the language grounding problem, where the goal is to extract
representations of the meanings of natural language tied to perception and
actuation in the physical world. In this paper, we present an approach for
joint learning of language and perception models for grounded attribute
induction. Our perception model includes attribute classifiers, for example to
detect object color and shape, and the language model is based on a
probabilistic categorial grammar that enables the construction of rich,
compositional meaning representations. The approach is evaluated on the task of
interpreting sentences that describe sets of objects in a physical workspace.
We demonstrate accurate task performance and effective latent-variable concept
induction in physical grounded scenes.
* Task: extract representations of the meanings of natural language tied to perception in the physical world
* Learns new grounded concepts from a set of scenes containing only sentences, images, and indications of which objects are being referred to
* System includes:
  * *Semantic parsing model*: defines a distribution over logical meaning representations for each input sentence
  * *Visual attribute classifiers*: one per attribute (e.g., color, shape), applied to each candidate object in the scene
  * *Joint model*: learns the mapping from logical constants in the logical form to the visual attribute classifiers (a minimal scoring sketch follows this list)
* Features are depth and RGB values extracted from the images: RGB for color attributes, depth for shape attributes (a feature-extraction sketch also follows below)
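
The sketch below shows one way such a joint scoring could be wired together; all names here (`Parse`, `score_grounding`, the classifier dictionary) are illustrative assumptions, not the paper's actual implementation. The idea it illustrates is the summary's: a parser-defined distribution over logical forms is combined with per-attribute classifier probabilities to score which objects a sentence refers to.

```python
# Minimal sketch of a joint language/perception scoring step.
# Names and structure are illustrative assumptions, not the paper's API.
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Parse:
    """A candidate logical form with its probability under the parsing model."""
    logical_constants: List[str]   # e.g. ["color_red", "shape_cube"]
    prob: float                    # P(logical form | sentence)


def score_grounding(
    parses: List[Parse],
    objects: List[Dict[str, float]],  # per-object feature vectors (hypothetical format)
    classifiers: Dict[str, Callable[[Dict[str, float]], float]],  # constant -> P(attribute | features)
) -> Dict[int, float]:
    """Score how likely each object is being referred to, marginalizing over parses.

    Under a given parse, an object matches if every logical constant's
    attribute classifier fires on that object's features.
    """
    scores = {i: 0.0 for i in range(len(objects))}
    for parse in parses:
        for i, feats in enumerate(objects):
            p_obj = 1.0
            for constant in parse.logical_constants:
                p_obj *= classifiers[constant](feats)   # P(attribute holds | object)
            scores[i] += parse.prob * p_obj             # weight by parse probability
    return scores
```

In the paper, the parse distribution comes from the probabilistic categorial grammar and the classifiers from the perception model; here both are treated as opaque inputs to keep the sketch self-contained.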
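
As a companion sketch, per-attribute classifiers over RGB-D features might look roughly like the following. The mean-RGB and depth-histogram features and the scikit-learn logistic regression are stand-in assumptions; the summary only states that RGB and depth values are used for color and shape attributes, not how they are featurized or which classifier is trained.

```python
# Minimal sketch of per-attribute classifiers over RGB-D features.
# Feature choices are simplifications for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression


def color_features(rgb_patch: np.ndarray) -> np.ndarray:
    """Mean RGB values of an object's image patch (H x W x 3)."""
    return rgb_patch.reshape(-1, 3).mean(axis=0)


def shape_features(depth_patch: np.ndarray, bins: int = 16) -> np.ndarray:
    """Histogram of normalized depth values of an object's patch (H x W)."""
    d = depth_patch[np.isfinite(depth_patch)]
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 1.0), density=True)
    return hist


def train_attribute_classifier(features: np.ndarray, labels: np.ndarray) -> LogisticRegression:
    """One binary classifier per attribute word (e.g. 'red', 'cube')."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(features, labels)
    return clf
```

Each classifier's `predict_proba` output could then serve as the `classifiers[constant](feats)` term in the scoring sketch above.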