A Hybrid System for Chinese Grammatical Error Diagnosis and Correction

Chen Li, Junpei Zhou, Zuyi Bao, Hengyou Liu, Guangwei Xu, Linlin Li


Abstract
This paper introduces the DM_NLP team’s system for the NLPTEA 2018 shared task on Chinese Grammatical Error Diagnosis (CGED), which can be used to detect and correct grammatical errors in texts written by Chinese as a Foreign Language (CFL) learners. The task involves not only detecting four types of grammatical errors, namely redundant words (R), missing words (M), bad word selection (S), and disordered words (W), but also recommending corrections for errors of types M and S. We proposed a hybrid system of four models that works in two stages: detection and correction. In the detection stage, we first used a BiLSTM-CRF model, along with some handcrafted features, to tag potential errors by sequence labeling. We then designed three Grammatical Error Correction (GEC) models to generate corrections, which helped to tune the detection result. In the correction stage, candidates generated by the three GEC models were merged to produce the final corrections for M and S errors. Our system achieved the highest precision in the correction subtask, which was the most challenging part of this shared task, and ranked in the top three by F1 score for error position detection.
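To make the position-detection output concrete, the sketch below shows one way per-character sequence labels could be turned into typed error spans. It is an illustration only, not the authors' code: the abstract says only that errors are tagged by sequence labeling, so the BIO tag scheme (`B-R`, `I-R`, `O`, and so on), the 1-indexed inclusive positions, and the function name `tags_to_spans` are all assumptions.

```python
from typing import List, Tuple

# The four error types named in the abstract.
ERROR_TYPES = {"R", "M", "S", "W"}


def tags_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert per-character BIO tags into (start, end, type) spans.

    Positions are 1-indexed and inclusive (an assumption for this sketch).
    """
    spans: List[Tuple[int, int, str]] = []
    start = None    # start position of the span currently being built
    current = None  # error type of the span currently being built

    for i, tag in enumerate(tags, start=1):
        prefix, _, etype = tag.partition("-")
        continues = prefix == "I" and etype == current

        # Close the open span when the tag does not continue it.
        if current is not None and not continues:
            spans.append((start, i - 1, current))
            start, current = None, None

        # Open a new span on B-, or on a stray I- with no open span.
        if prefix in ("B", "I") and etype in ERROR_TYPES and current is None:
            start, current = i, etype

    # Close a span that runs to the end of the sentence.
    if current is not None:
        spans.append((start, len(tags), current))
    return spans


example = ["O", "B-R", "I-R", "O", "B-S", "O", "B-M"]
spans = tags_to_spans(example)
```

For the example tags, the function yields one two-character R span and single-character S and M spans. Only the M and S spans would be passed on to a correction step, since the task asks for corrections for those two types alone.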
Anthology ID:
W18-3708
Volume:
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venues:
ACL | NLP-TEA | WS
Publisher:
Association for Computational Linguistics
Pages:
60–69
URL:
https://www.aclweb.org/anthology/W18-3708
DOI:
10.18653/v1/W18-3708
PDF:
http://aclanthology.lst.uni-saarland.de/W18-3708.pdf