Incorporating Copying Mechanism in Sequence-to-Sequence Learning
Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O. K. Li
arXiv e-Print archive - 2016 via Local arXiv
Keywords: cs.CL, cs.AI, cs.LG, cs.NE
First published: 2016/03/21
Abstract: We address an important problem in sequence-to-sequence (Seq2Seq) learning
referred to as copying, in which certain segments in the input sequence are
selectively replicated in the output sequence. A similar phenomenon is
observable in human language communication. For example, humans tend to repeat
entity names or even long phrases in conversation. The challenge with regard to
copying in Seq2Seq is that new machinery is needed to decide when to perform
the operation. In this paper, we incorporate copying into neural network-based
Seq2Seq learning and propose a new model called CopyNet with encoder-decoder
structure. CopyNet can nicely integrate the regular way of word generation in
the decoder with the new copying mechanism which can choose sub-sequences in
the input sequence and put them at proper places in the output sequence. Our
empirical study on both synthetic data sets and real world data sets
demonstrates the efficacy of CopyNet. For example, CopyNet can outperform
regular RNN-based models by remarkable margins on text summarization tasks.
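
The abstract describes a decoder that integrates ordinary word generation with copying from the input, but it does not spell out how the two are combined. As a rough illustration only, the sketch below assumes one plausible scheme: unnormalized generate-mode scores over a vocabulary and copy-mode scores over source positions are normalized with a single shared softmax, and the probability mass for a word that appears in both is summed. The function name `copy_generate_distribution`, the toy vocabulary, and all score values are hypothetical and are not taken from the paper.

```python
import math


def copy_generate_distribution(gen_scores, copy_scores, source_tokens):
    """Combine generate-mode and copy-mode scores into one distribution.

    Assumption (not stated in the abstract): both modes share a single
    softmax normalizer, and a word reachable by both modes gets the sum
    of its two probabilities.

    gen_scores:    dict mapping vocabulary word -> unnormalized score
    copy_scores:   list of unnormalized scores, one per source position
    source_tokens: list of source tokens aligned with copy_scores
    """
    if len(copy_scores) != len(source_tokens):
        raise ValueError("copy_scores and source_tokens must align")

    all_scores = list(gen_scores.values()) + list(copy_scores)
    shift = max(all_scores)  # subtract the max for numerical stability
    normalizer = sum(math.exp(s - shift) for s in all_scores)

    probs = {}
    # Generate mode: probability mass for words in the fixed vocabulary.
    for word, score in gen_scores.items():
        probs[word] = probs.get(word, 0.0) + math.exp(score - shift) / normalizer
    # Copy mode: probability mass for tokens at each source position,
    # including tokens that are absent from the vocabulary.
    for token, score in zip(source_tokens, copy_scores):
        probs[token] = probs.get(token, 0.0) + math.exp(score - shift) / normalizer
    return probs


# Toy example with made-up scores: "Alice" is out of vocabulary and can
# only be produced through the copy mode.
gen_scores = {"hello": 1.0, "my": 0.5, "name": 0.5, "is": 0.5, "<unk>": 0.0}
source_tokens = ["hello", "my", "name", "is", "Alice"]
copy_scores = [0.1, 0.0, 0.0, 0.0, 2.0]

probs = copy_generate_distribution(gen_scores, copy_scores, source_tokens)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

In this toy setup the out-of-vocabulary token receives probability only through the copy scores, which is the behavior the abstract attributes to copying entity names from the input. In a real model both sets of scores would be computed from decoder and encoder states rather than supplied by hand.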