To Aggregate or Not to aggregate: Selective Match Kernels for Image Search

Tolias, Giorgos and Avrithis, Yannis S. and Jégou, Hervé
International Conference on Computer Vision - 2013 via Local Bibsonomy

Summary by Evan Su 8 years ago

Descriptors and matching kernels are key components of an image search system. This paper presents a framework for matching kernels that covers non-aggregated kernels such as Hamming Embedding (HE) and aggregated kernels such as Bag-of-Words (BoW) and the vector of locally aggregated descriptors (VLAD). To evaluate the effectiveness of aggregation, the paper introduces the selective match kernel (SMK, non-aggregated) and the aggregated selective match kernel (ASMK), both built on this framework. Experimental results show that ASMK outperforms SMK and state-of-the-art methods because ASMK handles burstiness better than SMK.

Technical details 

The framework of matching kernels is described by the following general form.

$$K(\mathcal{X},\mathcal{Y}) = \gamma(\mathcal{X})\gamma(\mathcal{Y})
\displaystyle\sum\_{c \in C} w\_c M (\mathcal{X}\_c,\mathcal{Y}\_c)$$

where $\mathcal{X}$ and $\mathcal{Y}$ are the sets of local descriptors of the two images, $\mathcal{X}\_c$ and $\mathcal{Y}\_c$ are the subsets of those descriptors assigned to visual word $c$, $M$ is the similarity function, $w\_c$ is a scalar weight, and $\gamma$ is a normalization factor.
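The general form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each image is a hypothetical dict mapping a visual word id to an array of its descriptors, the match function `M` returns zero when either subset is empty (so only shared words are visited), and $\gamma$ is chosen so that an image matched with itself scores 1. None of the function names come from the paper.

```python
import numpy as np

def kernel_sum(X, Y, match, weights=None):
    """Unnormalized sum over visual words: sum_c w_c * M(X_c, Y_c).

    X, Y: dict {visual word id: (n, d) array of descriptors} (assumed layout).
    Only words present in both images are visited, which assumes
    M is zero whenever one of the two subsets is empty.
    """
    total = 0.0
    for c in X.keys() & Y.keys():
        w = 1.0 if weights is None else weights[c]
        total += w * match(X[c], Y[c])
    return total

def kernel(X, Y, match, weights=None):
    """K(X, Y) with gamma assumed to be the self-similarity normalizer.

    Raises ZeroDivisionError if an image has zero self-similarity.
    """
    gamma_x = kernel_sum(X, X, match, weights) ** -0.5
    gamma_y = kernel_sum(Y, Y, match, weights) ** -0.5
    return gamma_x * gamma_y * kernel_sum(X, Y, match, weights)

# Toy match function: counts all descriptor pairs in a word (BoW-like).
def count_pairs(xc, yc):
    return float(len(xc) * len(yc))

# Hypothetical toy images with 4-dimensional descriptors.
X = {0: np.ones((2, 4)), 1: np.ones((1, 4))}
Y = {0: np.ones((1, 4))}
score = kernel(X, Y, count_pairs)
```

Swapping `count_pairs` for a different `match` function is what distinguishes one kernel of the framework from another.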

The proposed selective match kernel (SMK) is denoted by

$$M\_N(\mathcal{X}\_c,\mathcal{Y}\_c) = 
\displaystyle\sum\_{x \in \mathcal{X}\_c} 
\displaystyle\sum\_{y \in \mathcal{Y}\_c} 
\sigma (\phi(x)^T\phi(y))$$

Note that $|\mathcal{X}\_c| \times |\mathcal{Y}\_c|$ matches (dot products) are needed for each visual word, where $|\cdot|$ denotes the number of descriptors in a set.
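A minimal NumPy sketch of the per-word double sum makes the cost visible. Two things are assumed here and not stated in this summary: the rows of `phi_x` and `phi_y` already hold the mapped descriptors $\phi(x)$ and $\phi(y)$, and $\sigma$ is a thresholded power function with hypothetical parameters `alpha` and `tau`.

```python
import numpy as np

def sigma(u, alpha=3.0, tau=0.0):
    """Assumed selectivity function: sign(u) * |u|^alpha if u > tau, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(u > tau, np.sign(u) * np.abs(u) ** alpha, 0.0)

def match_non_aggregated(phi_x, phi_y, alpha=3.0, tau=0.0):
    """Sum of sigma(phi(x)^T phi(y)) over all pairs in one visual word.

    phi_x: (n, d) array, phi_y: (m, d) array.
    The matrix product below holds all n * m dot products.
    """
    sims = phi_x @ phi_y.T
    return float(sigma(sims, alpha, tau).sum())

# Toy example: two descriptors in one image, one in the other.
phi_x = np.array([[1.0, 0.0], [0.0, 1.0]])
phi_y = np.array([[1.0, 0.0]])
score = match_non_aggregated(phi_x, phi_y)  # only the first pair passes tau
```

With `n` and `m` descriptors in the word, `sims` has `n * m` entries, which is the per-word cost mentioned above.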


The proposed aggregated selective match kernel (ASMK) is denoted by

![](http://1.bp.blogspot.com/-ocG18fK0TMM/VS8_MsfOeSI/AAAAAAAAA2k/BrXcXyF0lfM/s1600/ASMK.png)

Note that only one match (dot product) is needed for each visual word.
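The effect of aggregation can be sketched as follows. This is an illustration and not a transcription of the formula in the image: it assumes the descriptors of a visual word are summed and L2-normalized into one vector per image, and that $\sigma$ is a thresholded power function with hypothetical parameters `alpha` and `tau`.

```python
import numpy as np

def sigma(u, alpha=3.0, tau=0.0):
    """Assumed selectivity function: sign(u) * |u|^alpha if u > tau, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(u > tau, np.sign(u) * np.abs(u) ** alpha, 0.0)

def aggregate(phi):
    """Sum the (n, d) rows of one visual word and L2-normalize the result."""
    v = phi.sum(axis=0)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def match_aggregated(phi_x, phi_y, alpha=3.0, tau=0.0):
    """One dot product per visual word, taken after aggregation."""
    return float(sigma(aggregate(phi_x) @ aggregate(phi_y), alpha, tau))

# A "bursty" word: three identical descriptors in one image, one in the other.
phi_x = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
phi_y = np.array([[1.0, 0.0]])

aggregated = match_aggregated(phi_x, phi_y)      # 1.0
pairwise = float(sigma(phi_x @ phi_y.T).sum())   # 3.0
```

In this toy case the repeated descriptor counts three times in the pairwise sum but only once after aggregation, which is one way to see how aggregation dampens burstiness.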

Results

As shown in Figure 5, ASMK outperforms SMK and SMK-BURST, where BURST refers to burstiness normalization.

![](http://1.bp.blogspot.com/-PfG69JuQnxk/VS9B6BvrAfI/AAAAAAAAA20/_2eKsEiAdew/s1600/fig5.png)

Table 4 shows that ASMK outperforms state-of-the-art methods.

![](http://2.bp.blogspot.com/-3GtM-bH6C2c/VS9CRKqN3TI/AAAAAAAAA28/Grzn5QT_GhM/s1600/table%2B4.png)

Note that all the results above are from the initial result set. Re-ranking approaches are not included.
