This paper proposes a method that locates an object given an image and a sentence describing the objects in it, and then predicts an embedding for a new visual concept using two graphs: (1) a graph describing the relationships between the objects mentioned in a supplemental sentence, and (2) a graph describing the relationship between the detected object in the image and example images of the objects in the supplemental sentence. The resulting embedding can be used for downstream tasks such as visual entailment and visual reasoning.
For example, in the domain-1 image the new visual concept is "red": the model can locate the red cube in the image (1a), and in (1b) it can interpret the supplemental sentences that relate the novel concept to other concepts.
https://i.imgur.com/yBIteYT.png
To locate the box containing the object, this work first uses Mask R-CNN to detect the objects in the scene, then uses a neuro-symbolic program to match the objects mentioned in the input sentence against the objects detected by Mask R-CNN.
https://i.imgur.com/2cG9IUX.png
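As a rough illustration of this matching step (a minimal sketch, not the paper's actual neuro-symbolic executor; the function name and tensor shapes are assumptions), grounding can be viewed as scoring each detected object's features against a concept embedding:

```python
import torch
import torch.nn.functional as F

def ground_concept(object_features: torch.Tensor, concept_embedding: torch.Tensor) -> torch.Tensor:
    """Score each detected object against a concept embedding.

    object_features: (N, D) ROI features, one row per detected box.
    concept_embedding: (D,) embedding of the queried concept.
    Returns a distribution over the N boxes (the likely referent).
    """
    scores = F.cosine_similarity(object_features, concept_embedding.unsqueeze(0), dim=1)
    return scores.softmax(dim=0)

# Toy usage: 4 detected objects with 64-d features.
feats = torch.randn(4, 64)
concept = torch.randn(64)
print(ground_concept(feats, concept))  # the highest entry is the best-matching box
```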
To learn the concept embedding for that object, this work requires a supplemental sentence describing several objects, all of which are known concepts except for one novel concept. Two graphs are then built: $GNN_{concept}$ and $GNN_{example}$.
$GNN_{concept}$ is a graph representing the relationships between the known concepts and the novel concept. For example, in the graph below, "White-eyed Vireo" is the novel concept.
https://i.imgur.com/1LjirJz.png
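A minimal sketch of how such a concept graph could be processed with message passing (the layer sizes, mean aggregation, and GRU-style update are assumptions for illustration, not the paper's exact $GNN_{concept}$ architecture):

```python
import torch
import torch.nn as nn

class ConceptGNNLayer(nn.Module):
    """One round of message passing on the concept graph: known-concept
    nodes send messages to their neighbors (including the novel-concept
    node) along edges extracted from the supplemental sentence."""
    def __init__(self, dim: int):
        super().__init__()
        self.message = nn.Linear(dim, dim)
        self.update = nn.GRUCell(dim, dim)

    def forward(self, node_embs: torch.Tensor, edges: list[tuple[int, int]]) -> torch.Tensor:
        # node_embs: (num_nodes, dim); edges: directed (src, dst) pairs.
        agg = torch.zeros_like(node_embs)
        count = torch.zeros(node_embs.size(0), 1)
        for src, dst in edges:
            agg[dst] = agg[dst] + self.message(node_embs[src])
            count[dst] += 1
        agg = agg / count.clamp(min=1)      # mean over incoming messages
        return self.update(agg, node_embs)  # GRU-style node update

# Toy graph: nodes 0-2 are known concepts, node 3 is the novel concept.
embs = torch.randn(4, 64)
edges = [(0, 3), (1, 3), (2, 3)]
refined = ConceptGNNLayer(64)(embs, edges)
```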
$GNN_{example}$ is a graph representing the relationship between the detected object corresponding to the novel concept and example images of that novel concept.
https://i.imgur.com/SdR74Vu.png
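The example branch can be sketched in the same spirit: features of the example images are pooled into a single message that refines the novel-concept node (the attention pooling and residual update below are assumptions, not the paper's exact $GNN_{example}$):

```python
import torch
import torch.nn as nn

class ExampleAggregator(nn.Module):
    """Pools example-image features into one message for the novel-concept node."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # attention score per example image
        self.proj = nn.Linear(dim, dim)  # project the pooled message

    def forward(self, novel_emb: torch.Tensor, example_feats: torch.Tensor) -> torch.Tensor:
        # novel_emb: (dim,); example_feats: (K, dim) for K example images.
        weights = self.score(example_feats).softmax(dim=0)  # (K, 1)
        pooled = (weights * example_feats).sum(dim=0)       # (dim,)
        return novel_emb + self.proj(pooled)                # refined novel-concept node
```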
From these two graphs, the model then learns the concept embedding of the novel concept.
https://i.imgur.com/YGYEPvc.png
https://i.imgur.com/VXOBn6n.png
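Putting the pieces together, one plausible reading of the pipeline (the concatenate-and-MLP fusion is an assumption for illustration) is that the novel-concept states from the two graphs are fused into the final embedding, which can then be fed back into a grounding step like the one sketched above for downstream tasks:

```python
import torch
import torch.nn as nn

class ConceptPredictor(nn.Module):
    """Fuses the novel-concept node states from the two graphs into the
    final concept embedding used for grounding and downstream reasoning."""
    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(2 * dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, concept_msg: torch.Tensor, example_msg: torch.Tensor) -> torch.Tensor:
        # concept_msg / example_msg: (dim,) novel-concept states from
        # GNN_concept and GNN_example respectively.
        return self.fuse(torch.cat([concept_msg, example_msg], dim=-1))

# Toy usage: fuse two 64-d messages into the novel concept's embedding.
predictor = ConceptPredictor(64)
embedding = predictor(torch.randn(64), torch.randn(64))
```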