Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita and David Talbot and Fedor Moiseev and Rico Sennrich and Ivan Titov
arXiv e-Print archive - 2019 via Local arXiv
Keywords: cs.CL


Summary by CodyWild 5 years ago
