Diversity-Driven Combination for Grammatical Error Correction
Wenjuan Han and Hwee Tou Ng
arXiv e-Print archive - 2021 via Local arXiv
Keywords: cs.CL
First published: 2024/12/22
Abstract: Grammatical error correction (GEC) is the task of detecting and correcting
errors in a written text. The idea of combining multiple system outputs has
been successfully used in GEC. To achieve successful system combination,
multiple component systems need to produce corrected sentences that are both
diverse and of comparable quality. However, most existing state-of-the-art GEC
approaches are based on similar sequence-to-sequence neural networks, so
combining the outputs of such similar component systems yields only limited
gains. In this paper, we present Diversity-Driven Combination (DDC) for
GEC, a system combination strategy that encourages diversity among component
systems. We evaluate our system combination strategy on the CoNLL-2014 shared
task and the BEA-2019 shared task. On both benchmarks, DDC achieves a significant
performance gain with a small number of training examples and outperforms the
component systems by a large margin. Our source code is available at
https://github.com/nusnlp/gec-ddc.