Low-Resource G2P and P2G Conversion with Synthetic Training Data

Bradley Hauer, Amir Ahmad Habibi, Yixing Luan, Arnob Mallik, Grzegorz Kondrak


Abstract
This paper presents the University of Alberta systems and results in the SIGMORPHON 2020 Task 1: Multilingual Grapheme-to-Phoneme Conversion. Following previous SIGMORPHON shared tasks, we define a low-resource setting with 100 training instances. We experiment with three transduction approaches in both standard and low-resource settings, as well as on the related task of phoneme-to-grapheme conversion. We propose a method for synthesizing training data using a combination of diverse models.
Anthology ID:
2020.sigmorphon-1.12
Volume:
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
Month:
July
Year:
2020
Address:
Online
Venues:
ACL | SIGMORPHON | WS
SIG:
SIGMORPHON
Publisher:
Association for Computational Linguistics
Note:
Pages:
117–122
Language:
URL:
https://www.aclweb.org/anthology/2020.sigmorphon-1.12
DOI:
10.18653/v1/2020.sigmorphon-1.12
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.sigmorphon-1.12.pdf