Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
Yue Yang, Artemis Panagopoulou, Shenghao Zhou, Daniel Jin, Chris Callison-Burch, Mark Yatskar
arXiv e-Print archive, 2022
Keywords: cs.CV, cs.CL
Abstract: Concept Bottleneck Models (CBM) are inherently interpretable models that
factor model decisions into human-readable concepts. They allow people to
easily understand why a model is failing, a critical feature for high-stakes
applications. CBMs require manually specified concepts and often under-perform
their black box counterparts, preventing their broad adoption. We address these
shortcomings and are first to show how to construct high-performance CBMs
without manual specification of similar accuracy to black box models. Our
approach, Language Guided Bottlenecks (LaBo), leverages a language model,
GPT-3, to define a large space of possible bottlenecks. Given a problem domain,
LaBo uses GPT-3 to produce factual sentences about categories to form candidate
concepts. LaBo efficiently searches possible bottlenecks through a novel
submodular utility that promotes the selection of discriminative and diverse
information. Ultimately, GPT-3's sentential concepts can be aligned to images
using CLIP, to form a bottleneck layer. Experiments demonstrate that LaBo is a
highly effective prior for concepts important to visual recognition. In the
evaluation with 11 diverse datasets, LaBo bottlenecks excel at few-shot
classification: they are 11.7% more accurate than black box linear probes at 1
shot and comparable with more data. Overall, LaBo demonstrates that inherently
interpretable models can be widely applied at similar, or better, performance
than black box approaches.
What is the paper doing?
This paper proposes a way to explain model decisions through human-readable concepts. For example, if the model predicts that the following image is a black-throated sparrow, a human can understand this decision via the input descriptors.
https://i.imgur.com/xVleDhp.png
The descriptors were obtained from GPT-3: they generated 500 descriptors for each class and then removed the class name from each descriptor, so the concept text cannot simply leak the label.
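As a rough sketch of this step (the prompt wording is illustrative, and `query_llm` is a hypothetical stand-in for whatever GPT-3 client is used, not the paper's exact setup):

```python
import re

def strip_class_name(descriptor: str, class_name: str) -> str:
    """Remove the class name from a generated descriptor so the
    bottleneck cannot rely on the label leaking into the concept text."""
    pattern = re.compile(re.escape(class_name), flags=re.IGNORECASE)
    return pattern.sub("", descriptor).strip(" ,.;:")

def generate_descriptors(class_name: str, query_llm, n: int = 500) -> list:
    """Query a GPT-style model n times and clean each descriptor.
    `query_llm(prompt) -> str` is assumed to wrap the actual API call."""
    prompt = f"Describe what the {class_name} looks like:"
    raw = [query_llm(prompt) for _ in range(n)]
    return [strip_class_name(d, class_name) for d in raw]
```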
After that, these candidate concepts go through a concept selection module that picks $k$ concepts per class, so every class contributes an equal number of concepts to the bottleneck.
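The abstract describes this selection as maximizing a submodular utility that rewards discriminative and diverse concepts. The particular utility below (a modular discriminability term plus a facility-location coverage term, traded off by `alpha`) is an illustrative assumption rather than the paper's exact objective, but the greedy loop is the standard way such objectives are optimized:

```python
import numpy as np

def greedy_submodular_select(disc, sim, k, alpha=0.5):
    """Greedily pick k concepts for one class.

    disc  : (N,) per-concept discriminability scores for the target class.
    sim   : (N, N) pairwise concept-concept similarity matrix.
    alpha : trade-off between discriminability and coverage/diversity.
    """
    n = len(disc)
    selected = []
    covered = np.zeros(n)  # best coverage of each candidate so far
    for _ in range(k):
        best_gain, best_j = -np.inf, None
        for j in range(n):
            if j in selected:
                continue
            # marginal gain of adding concept j to the current set
            cov_gain = np.maximum(covered, sim[:, j]).sum() - covered.sum()
            gain = alpha * disc[j] + (1 - alpha) * cov_gain
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        covered = np.maximum(covered, sim[:, best_j])
    return selected
```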
Then, they feed these concepts and the image into CLIP, which scores each concept by its similarity to the image; these scores form the bottleneck layer.
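A minimal sketch of this scoring step with OpenAI's CLIP package (the backbone, image path, and concept strings are placeholders, not the paper's exact configuration):

```python
import torch
import clip  # https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

concepts = ["black throat patch", "white stripe above the eye", "gray-brown wings"]
image = preprocess(Image.open("bird.jpg")).unsqueeze(0).to(device)
text = clip.tokenize(concepts).to(device)

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(text)

# cosine similarity between the image and each concept = bottleneck activations
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
concept_scores = img_emb @ txt_emb.T  # shape (1, num_concepts)
```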
Finally, a class-concept weight matrix on top of CLIP maps these concept scores to the predicted class and is fine-tuned on the task. Note that this weight matrix is initialized with language priors.
https://i.imgur.com/r9Op5Lm.png
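A minimal sketch of such a head in PyTorch, assuming the language prior is a class-by-concept similarity matrix and using a softmax over each class's concept weights to keep them positive (an illustrative normalization choice, not necessarily the paper's exact parameterization):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConceptBottleneckHead(nn.Module):
    """Linear class-concept layer on top of the CLIP concept scores.

    `prior` is a (num_classes, num_concepts) matrix, e.g. concept/class-name
    text similarities, used to initialize the weights (the language prior).
    """
    def __init__(self, prior: torch.Tensor):
        super().__init__()
        self.weight = nn.Parameter(prior.clone())  # init from language prior

    def forward(self, concept_scores):  # (batch, num_concepts)
        # softmax over concepts keeps each class's weights positive and normalized
        return concept_scores @ F.softmax(self.weight, dim=-1).T  # (batch, num_classes)

# toy usage with random stand-ins for the real scores and prior
num_classes, num_concepts = 200, 200 * 50
prior = torch.rand(num_classes, num_concepts)
head = ConceptBottleneckHead(prior)
logits = head(torch.rand(8, num_concepts))
loss = F.cross_entropy(logits, torch.randint(num_classes, (8,)))
```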