Object-Centric Learning with Slot Attention
Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf
arXiv e-Print archive, 2020
Keywords: cs.LG, cs.CV, stat.ML
Abstract: Learning object-centric representations of complex scenes is a promising step
towards enabling efficient abstract reasoning from low-level perceptual
features. Yet, most deep learning approaches learn distributed representations
that do not capture the compositional properties of natural scenes. In this
paper, we present the Slot Attention module, an architectural component that
interfaces with perceptual representations such as the output of a
convolutional neural network and produces a set of task-dependent abstract
representations which we call slots. These slots are exchangeable and can bind
to any object in the input by specializing through a competitive procedure over
multiple rounds of attention. We empirically demonstrate that Slot Attention
can extract object-centric representations that enable generalization to unseen
compositions when trained on unsupervised object discovery and supervised
property prediction tasks.
The Slot Attention module maps from a set of N input feature vectors to a set of K
output vectors that we refer to as slots. Each vector in this output set can, for example, describe
an object or an entity in the input.
https://i.imgur.com/81nh508.png
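A minimal PyTorch sketch of this mapping, following Algorithm 1 in the paper: slots are sampled from a shared learned Gaussian (which is what makes them exchangeable), and in each of a few attention rounds the softmax is taken over the slot axis, so the slots compete for input features; the attended values then update each slot via a GRU and a residual MLP. Class and variable names here are illustrative, while the 3 iterations and the GRU-plus-MLP update follow the paper's defaults.

```python
import torch
import torch.nn as nn


class SlotAttention(nn.Module):
    """Minimal Slot Attention sketch: maps N input features to K slots."""

    def __init__(self, num_slots, dim, iters=3, hidden_dim=128, eps=1e-8):
        super().__init__()
        self.num_slots = num_slots
        self.iters = iters
        self.eps = eps
        self.scale = dim ** -0.5

        # Slots are drawn from a shared learned Gaussian, so they are
        # exchangeable and bind to objects only through the attention dynamics.
        self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
        self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))

        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

        self.gru = nn.GRUCell(dim, dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, dim)
        )

        self.norm_inputs = nn.LayerNorm(dim)
        self.norm_slots = nn.LayerNorm(dim)
        self.norm_mlp = nn.LayerNorm(dim)

    def forward(self, inputs):
        # inputs: (batch, N, dim), e.g. flattened CNN feature maps.
        b, n, d = inputs.shape
        inputs = self.norm_inputs(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)

        # Sample initial slots: (batch, K, dim).
        mu = self.slots_mu.expand(b, self.num_slots, -1)
        sigma = self.slots_log_sigma.exp().expand(b, self.num_slots, -1)
        slots = mu + sigma * torch.randn_like(mu)

        for _ in range(self.iters):
            slots_prev = slots
            q = self.to_q(self.norm_slots(slots))

            # Softmax over the slot axis: slots compete for each input feature.
            attn_logits = torch.einsum('bnd,bkd->bnk', k, q) * self.scale
            attn = attn_logits.softmax(dim=-1) + self.eps

            # Weighted mean of values per slot.
            attn = attn / attn.sum(dim=1, keepdim=True)
            updates = torch.einsum('bnk,bnd->bkd', attn, v)

            # Per-slot recurrent update plus residual MLP.
            slots = self.gru(
                updates.reshape(-1, d), slots_prev.reshape(-1, d)
            ).reshape(b, -1, d)
            slots = slots + self.mlp(self.norm_mlp(slots))

        return slots


if __name__ == "__main__":
    feats = torch.randn(2, 64, 64)                # batch of 2, N=64 features of dim 64
    slot_attn = SlotAttention(num_slots=7, dim=64)
    slots = slot_attn(feats)
    print(slots.shape)                            # torch.Size([2, 7, 64])
```

The key departure from standard attention is the normalization axis: the softmax runs over slots rather than over inputs, so the attention coefficients for each input feature sum to one across slots, which is what forces the slots to specialize.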