Alibaba at IJCNLP-2017 Task 1: Embedding Grammatical Features into LSTMs for Chinese Grammatical Error Diagnosis Task

Yi Yang, Pengjun Xie, Jun Tao, Guangwei Xu, Linlin Li, Luo Si


Abstract
This paper introduces Alibaba NLP team system on IJCNLP 2017 shared task No. 1 Chinese Grammatical Error Diagnosis (CGED). The task is to diagnose four types of grammatical errors which are redundant words (R), missing words (M), bad word selection (S) and disordered words (W). We treat the task as a sequence tagging problem and design some handcraft features to solve it. Our system is mainly based on the LSTM-CRF model and 3 ensemble strategies are applied to improve the performance. At the identification level and the position level our system gets the highest F1 scores. At the position level, which is the most difficult level, we perform best on all metrics.
Anthology ID:
I17-4006
Volume:
Proceedings of the IJCNLP 2017, Shared Tasks
Month:
December
Year:
2017
Address:
Taipei, Taiwan
Venue:
IJCNLP
SIG:
Publisher:
Asian Federation of Natural Language Processing
Note:
Pages:
41–46
Language:
URL:
https://www.aclweb.org/anthology/I17-4006
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/I17-4006.pdf