Minimally Supervised Number Normalization

Kyle Gorman, Richard Sproat


Abstract
We propose two models for verbalizing numbers, a key component in speech recognition and synthesis systems. The first model uses an end-to-end recurrent neural network. The second model, drawing inspiration from the linguistics literature, uses finite-state transducers constructed with a minimal amount of training data. While both models achieve near-perfect performance, the latter model can be trained using several orders of magnitude less data than the former, making it particularly useful for low-resource languages.
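The task described here maps digit strings to their written-out forms, e.g. 507 → "five hundred seven". As a toy illustration of that mapping only (our own sketch, not the paper's RNN or FST model; the names verbalize, ONES, and TENS are invented for this example), a hand-written English verbalizer for 0–999 might look like this in Python:

    # Toy number verbalizer for integers 0-999 (illustration of the task,
    # not the paper's method). Maps e.g. 507 -> "five hundred seven".
    ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
            "eight", "nine", "ten", "eleven", "twelve", "thirteen",
            "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
            "nineteen"]
    TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
            "seventy", "eighty", "ninety"]

    def verbalize(n: int) -> str:
        """Return the English verbalization of an integer in [0, 999]."""
        if not 0 <= n <= 999:
            raise ValueError("this toy only covers 0-999")
        if n < 20:                      # "zero" through "nineteen"
            return ONES[n]
        if n < 100:                     # "twenty" through "ninety-nine"
            tens, ones = divmod(n, 10)
            return TENS[tens] + ("-" + ONES[ones] if ones else "")
        hundreds, rest = divmod(n, 100) # recurse on the remainder
        head = ONES[hundreds] + " hundred"
        return head + (" " + verbalize(rest) if rest else "")

    assert verbalize(507) == "five hundred seven"
    assert verbalize(21) == "twenty-one"

The appeal of the paper's finite-state approach, per the abstract, is that rules of this kind need not be hand-written for each language: the transducers are constructed from only a small amount of training data.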
Anthology ID: Q16-1036
Volume: Transactions of the Association for Computational Linguistics, Volume 4
Year: 2016
Venue: TACL
Pages: 507–519
URL: https://www.aclweb.org/anthology/Q16-1036
DOI: 10.1162/tacl_a_00114
PDF: http://aclanthology.lst.uni-saarland.de/Q16-1036.pdf
Video: https://vimeo.com/239246509