A Corpus-based Approach for Spanish-Chinese Language Learning

Shuyuan Cao, Iria da Cunha, Mikel Iruskieta


Abstract
Because Spanish and Chinese each have a very large number of speakers, both languages occupy an important position in language learning studies. Although some automatic translation systems support the learning of both languages, there is still ample room to create resources that help language learners. As a quick and effective resource that can provide a large amount of language information, corpus-based learning is becoming increasingly popular. In this paper we enrich a Spanish-Chinese parallel corpus automatically with part-of-speech (POS) information and manually with discourse segmentation, following Rhetorical Structure Theory (RST; Mann and Thompson, 1988). Two search tools allow Spanish-Chinese language learners to carry out different queries based on tokens and lemmas. The parallel corpus and the search tools are available to the academic community. We present some examples to illustrate how learners can use the corpus to learn Spanish and Chinese.
Anthology ID:
W16-4913
Volume:
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venues:
NLP-TEA | WS
Publisher:
The COLING 2016 Organizing Committee
Pages:
97–106
URL:
https://www.aclweb.org/anthology/W16-4913
PDF:
http://aclanthology.lst.uni-saarland.de/W16-4913.pdf