Amharic-English Speech Translation in Tourism Domain

Michael Melese, Laurent Besacier, Million Meshesha


Abstract
This paper describes Amharic-to-English speech translation, in particular Automatic Speech Recognition (ASR) with a post-editing feature and Amharic-English Statistical Machine Translation (SMT). The ASR experiment is conducted using a morpheme language model (LM) and a phoneme acoustic model (AM). Likewise, SMT is conducted using the word and the morpheme as translation units. Morpheme-based translation yields a BLEU score of 6.29 at a recognition accuracy of 76.4%, while word-based translation yields a BLEU score of 12.83 at a word recognition accuracy of 77.4%. Further, after post-editing the Amharic ASR output using a corpus-based n-gram, the word recognition accuracy increased by 1.42%. Since the post-editing approach reduces error propagation, the word-based translation accuracy improved by 0.25 BLEU (1.95%). We are now working towards further reducing propagated errors through different algorithms at each component of the speech translation cascade.
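The 1.95% figure attached to the 0.25 BLEU gain is consistent with a relative improvement over the 12.83 word-based baseline. The short Python check below reproduces that arithmetic. The function name `relative_gain` and the variable names are illustrative, and the post-edit BLEU of 13.08 is derived here from the two reported numbers rather than quoted from the abstract.

```python
def relative_gain(baseline: float, absolute_gain: float) -> float:
    """Return the gain expressed as a percentage of the baseline score."""
    return 100.0 * absolute_gain / baseline


# Figures reported in the abstract for word-based translation.
baseline_bleu = 12.83   # BLEU before ASR post-editing
bleu_gain = 0.25        # absolute BLEU improvement after post-editing

# Derived values (not quoted in the abstract).
post_edit_bleu = baseline_bleu + bleu_gain
rel = relative_gain(baseline_bleu, bleu_gain)

print(f"{post_edit_bleu:.2f} BLEU, +{rel:.2f}% relative")
```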
Anthology ID: W17-4608
Volume: Proceedings of the Workshop on Speech-Centric Natural Language Processing
Month: September
Year: 2017
Address: Copenhagen, Denmark
Venue: WS
Publisher: Association for Computational Linguistics
Pages: 59–66
URL: https://www.aclweb.org/anthology/W17-4608
DOI: 10.18653/v1/W17-4608
PDF: http://aclanthology.lst.uni-saarland.de/W17-4608.pdf