Embodied Question Answering
Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
arXiv e-Print archive - 2017 via Local arXiv
Keywords: cs.CV, cs.AI, cs.CL, cs.LG
First published: 2017/11/30
Abstract: We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where
an agent is spawned at a random location in a 3D environment and asked a
question ("What color is the car?"). In order to answer, the agent must first
intelligently navigate to explore the environment, gather information through
first-person (egocentric) vision, and then answer the question ("orange").
This challenging task requires a range of AI skills -- active perception,
language understanding, goal-driven navigation, commonsense reasoning, and
grounding of language into actions. In this work, we develop the environments,
end-to-end-trained reinforcement learning agents, and evaluation protocols for
EmbodiedQA.
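
As a rough illustration of the episode structure the abstract describes (spawn at a random location, navigate, observe from a first-person view, then answer), here is a minimal Python sketch. Everything in it is an assumption made for illustration: `ToyEnv`, `run_episode`, the one-dimensional corridor, and the hand-written sweep policy are hypothetical and are not the paper's 3D environments or its reinforcement learning agents.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical stand-in for a 3D environment: a corridor of cells."""
    length: int
    objects: dict  # cell index -> (object name, color)
    agent_pos: int = 0

    def reset(self, rng):
        # Spawn the agent at a random location.
        self.agent_pos = rng.randrange(self.length)
        return self.observe()

    def observe(self):
        # Egocentric view: the agent only sees the cell it stands in.
        return self.objects.get(self.agent_pos)

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the move.
        self.agent_pos = max(0, min(self.length - 1, self.agent_pos + action))
        return self.observe()


def run_episode(env, target_name, rng, max_steps=50):
    """Explore until the named object is seen, then answer with its color.

    Returns None if the object is not found within max_steps.
    """
    seen = env.reset(rng)
    direction = 1
    for _ in range(max_steps):
        if seen is not None and seen[0] == target_name:
            return seen[1]
        # Fixed sweep policy (not learned): bounce between the two walls.
        if env.agent_pos == env.length - 1:
            direction = -1
        elif env.agent_pos == 0:
            direction = 1
        seen = env.step(direction)
    if seen is not None and seen[0] == target_name:
        return seen[1]
    return None


env = ToyEnv(length=8, objects={2: ("sofa", "blue"), 5: ("car", "orange")})
answer = run_episode(env, "car", random.Random(0))  # "What color is the car?"
```

The sketch replaces language understanding with a plain object name and replaces learned navigation with a fixed sweep, so it only shows the control flow of an episode: the answer can be produced only after navigation has brought the relevant object into view.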