Language Model Based Grammatical Error Correction without Annotated Training Data

Christopher Bryant, Ted Briscoe


Abstract
Since the end of the CoNLL-2014 shared task on grammatical error correction (GEC), research into language model (LM) based approaches to GEC has largely stagnated. In this paper, we re-examine LMs in GEC and show that it is entirely possible to build a simple system that not only requires minimal annotated data (∼1000 sentences), but is also fairly competitive with several state-of-the-art systems. This approach should be of particular interest for languages where very little annotated training data exists, although we also hope to use it as a baseline to motivate future research.
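The abstract does not spell out the method, but the general idea behind LM-based GEC is to generate candidate rewrites of a sentence and keep the one a language model scores as most probable, which is why little or no error-annotated data is needed. The Python sketch below illustrates that idea under stated assumptions: a toy bigram LM with add-one smoothing trained on a few in-code sentences, and small hand-written confusion sets (CONFUSION). Both are illustrative placeholders, not the authors' actual models or candidate generation.

import math
from collections import Counter
from itertools import product

# Toy corpus for estimating a bigram LM (illustrative only; a real system
# would use a large LM trained on native text).
CORPUS = [
    "there is a cat on the mat",
    "there is a dog in the house",
    "the cat sat on the mat",
    "an apple a day keeps the doctor away",
]

def train_bigram_lm(sentences):
    """Return a function scoring a sentence with an add-one smoothed bigram LM."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def logprob(sentence):
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        score = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            # Add-one smoothed bigram probability.
            score += math.log((bigrams[(prev, cur)] + 1) /
                              (unigrams[prev] + vocab_size))
        return score

    return logprob

# Hypothetical confusion sets: alternatives the system is allowed to try
# for each word; unlisted words are left unchanged.
CONFUSION = {"their": {"their", "there"}, "a": {"a", "an"}, "an": {"a", "an"}}

def correct(sentence, logprob):
    """Return the candidate (original included) with the highest LM score."""
    words = sentence.split()
    options = [sorted(CONFUSION.get(w, {w})) for w in words]
    candidates = [" ".join(c) for c in product(*options)]
    return max(candidates, key=logprob)

if __name__ == "__main__":
    lm = train_bigram_lm(CORPUS)
    print(correct("their is a cat on the mat", lm))  # -> "there is a cat on the mat"

In a full system the confusion sets would come from resources such as a spellchecker or morphological variants and the LM from large native corpora; the key point is that ranking candidates by LM probability requires essentially no error-annotated training data.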
Anthology ID: W18-0529
Volume: Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month: June
Year: 2018
Address: New Orleans, Louisiana
Venues: BEA | NAACL | WS
SIG: SIGEDU
Publisher: Association for Computational Linguistics
Pages: 247–253
URL: https://www.aclweb.org/anthology/W18-0529
DOI: 10.18653/v1/W18-0529
PDF: http://aclanthology.lst.uni-saarland.de/W18-0529.pdf