A Regularized Framework for Sparse and Structured Neural Attention.
Vlad Niculae and Mathieu Blondel
Neural Information Processing Systems Conference - 2017

Summary by Joseph Paul Cohen 7 years ago

The idea in this paper is to develop a version of attention that incorporates similarity between neighboring bins. This aligns with the work of \cite{conf/icml/BeckhamP17}, which presented a different approach to enforcing consistency between classes of predictions.

In this work, the closed-form softmax function is replaced by a small optimization problem whose objective adds this regularization term:

$$ +\lambda \sum_{i=1}^{d-1} |y_{i+1}-y_i|$$

Because of this, many neighboring probabilities are exactly equal, so the resulting attention weights form contiguous blocks.
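
To make the block structure concrete, here is a minimal pure-Python sketch. It assumes the operator (called fusedmax in the paper) can be computed as a 1-D total-variation proximal step followed by a Euclidean projection onto the probability simplex (sparsemax), which is the decomposition the paper describes. The dual projected-gradient solver, the function names, the step size, and the iteration count are illustrative assumptions, not the authors' implementation.

```python
def sparsemax(x):
    """Euclidean projection of x onto the probability simplex."""
    z = sorted(x, reverse=True)
    cumsum, tau = 0.0, 0.0
    for k, zk in enumerate(z, start=1):
        cumsum += zk
        t = (cumsum - 1.0) / k
        if zk > t:  # holds for a prefix of the sorted values
            tau = t
    return [max(v - tau, 0.0) for v in x]


def prox_tv_1d(x, lam, n_iter=500):
    """Approximate argmin_y 0.5*||y - x||^2 + lam * sum_i |y[i+1] - y[i]|.

    Assumption: solved by projected gradient on the dual variables u,
    one per neighboring pair, each constrained to [-lam, lam].
    """
    u = [0.0] * (len(x) - 1)

    def primal(u):
        # Recover y from the dual variables: y[j] = x[j] + u[j] - u[j-1].
        y = list(x)
        for i, ui in enumerate(u):
            y[i] += ui
            y[i + 1] -= ui
        return y

    for _ in range(n_iter):
        y = primal(u)
        # Step size 0.25 is safe because the difference operator has norm^2 < 4.
        u = [min(lam, max(-lam, ui + 0.25 * (y[i + 1] - y[i])))
             for i, ui in enumerate(u)]
    return primal(u)


def fusedmax(x, lam):
    """Sketch of the fused attention map: TV prox, then simplex projection."""
    return sparsemax(prox_tv_1d(x, lam))


scores = [0.0, 2.0, 2.2, 0.0]
print(sparsemax(scores))          # approximately [0, 0.4, 0.6, 0]
print(fusedmax(scores, lam=0.2))  # approximately [0, 0.5, 0.5, 0]
```

With these hypothetical scores, the plain simplex projection gives the two middle positions different weights, while the added penalty pulls them to a shared value, producing one block. Because the solver is iterative, equality between neighbors holds up to numerical tolerance rather than bit-for-bit.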

https://i.imgur.com/oue0x4V.png

Poster:
https://i.imgur.com/gclMjzR.png