Low-Resource Machine Transliteration Using Recurrent Neural Networks of Asian Languages

Ngoc Tan Le, Fatiha Sadat


Abstract
Grapheme-to-phoneme models are key components in automatic speech recognition and text-to-speech systems. They are particularly useful for low-resource language pairs that lack well-developed pronunciation lexicons. These models are based on initial alignments between grapheme source sequences and phoneme target sequences. Inspired by sequence-to-sequence recurrent neural network-based translation methods, this research presents an approach that applies an alignment representation for input sequences, together with pre-trained source and target embeddings, to address the transliteration problem for a low-resource language pair. We participated in the English-Vietnamese transliteration task of the NEWS 2018 shared task.
Anthology ID:
W18-2414
Volume:
Proceedings of the Seventh Named Entities Workshop
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venues:
ACL | NEWS | WS
Publisher:
Association for Computational Linguistics
Pages:
95–100
URL:
https://www.aclweb.org/anthology/W18-2414
DOI:
10.18653/v1/W18-2414
PDF:
http://aclanthology.lst.uni-saarland.de/W18-2414.pdf