Cross-Lingual Content Scoring
Andrea Horbach, Sebastian Stennmanns, Torsten Zesch
Abstract
We investigate the feasibility of cross-lingual content scoring, a scenario where training and test data in an automatic scoring task are from two different languages. Cross-lingual scoring can contribute to educational equality by allowing answers in multiple languages. Training a model in one language and applying it to another language might also help to overcome data sparsity issues by re-using trained models from other languages. As there is no suitable dataset available for this new task, we create a comparable bi-lingual corpus by extending the English ASAP dataset with German answers. Our experiments with cross-lingual scoring based on machine-translating either training or test data show a considerable drop in scoring quality.
- Anthology ID:
- W18-0550
- Volume:
- Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venues:
- BEA | NAACL | WS
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Pages:
- 410–419
- URL:
- https://www.aclweb.org/anthology/W18-0550
- DOI:
- 10.18653/v1/W18-0550
- PDF:
- http://aclanthology.lst.uni-saarland.de/W18-0550.pdf