Complex Word Identification: Convolutional Neural Network vs. Feature Engineering

Segun Taofeek Aroyehun, Jason Angel, Daniel Alejandro Pérez Alvarez, Alexander Gelbukh


Abstract
We describe the systems of the NLP-CIC team that participated in the Complex Word Identification (CWI) 2018 shared task. The shared task aimed to benchmark approaches for identifying complex words in English and other languages from the perspective of non-native speakers. Our goal is to compare two approaches: feature engineering and a deep neural network. Both approaches achieved comparable performance on the English test set. We demonstrated the flexibility of the deep-learning approach by using the same deep neural network setup in the Spanish track. Our systems achieved competitive results: all our systems were within 0.01 of the system with the best macro-F1 score on the test sets, except on the Wikipedia test set, on which our best system was 0.04 below the best macro-F1 score.
Anthology ID:
W18-0538
Volume:
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Venues:
BEA | NAACL | WS
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
322–327
URL:
https://www.aclweb.org/anthology/W18-0538
DOI:
10.18653/v1/W18-0538
PDF:
http://aclanthology.lst.uni-saarland.de/W18-0538.pdf