Generating Task-Pertinent Sorted Error Lists for Speech Recognition

Olivier Galibert, Mohamed Ameur Ben Jannet, Juliette Kahn, Sophie Rosset


Abstract
Automatic Speech Recognition (ASR) is one of the most widely used components in spoken language processing applications. ASR errors vary in importance depending on the application, which makes error analysis key to improving speech processing applications. Knowing which errors are the most serious for the application at hand is critical to building better systems. When ASR is used as a first step towards Named Entity Recognition (NER) in speech, error seriousness is usually determined by error frequency, because WER is the metric used to evaluate the ASR output, despite the emergence of more relevant measures in the literature. We propose to use a different evaluation metric from the literature in order to classify ASR errors according to their seriousness for NER. Our results show that ASR errors are ranked differently in importance depending on the evaluation metric used. A more detailed analysis shows that the error impact estimated by the ATENE metric is better suited to the NER task than an estimation based only on frequency, as given by the most widely used metric, WER.
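To make the contrast concrete, the sketch below illustrates the frequency-based view of error seriousness that the abstract associates with WER: ASR errors are collected from a word-level edit-distance alignment and ranked purely by how often they occur. This is an illustrative example, not the paper's implementation, and it does not reproduce the ATENE metric; the function names (`align`, `wer_and_error_ranking`) and the toy reference/hypothesis pairs are hypothetical.

```python
# Illustrative sketch (not the paper's tooling): ranking ASR errors purely by
# frequency, as implied by a WER-based view of error seriousness.
from collections import Counter


def align(ref, hyp):
    """Word-level Levenshtein alignment; returns a list of (op, ref_word, hyp_word)."""
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    # Backtrace to recover the edit operations.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            op = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    return list(reversed(ops))


def wer_and_error_ranking(pairs):
    """Compute WER over (reference, hypothesis) pairs and rank errors by raw frequency."""
    errors, ref_len, edits = Counter(), 0, 0
    for ref, hyp in pairs:
        ref, hyp = ref.split(), hyp.split()
        ref_len += len(ref)
        for op, r, h in align(ref, hyp):
            if op != "ok":
                edits += 1
                errors[(op, r, h)] += 1
    return edits / ref_len, errors.most_common()


if __name__ == "__main__":
    # Hypothetical toy data: the same misrecognition of a named entity recurs.
    pairs = [
        ("the summit in paris ended late", "the summit in pairs ended late"),
        ("talks resumed in paris today", "talks resumed in pairs today"),
    ]
    wer, ranking = wer_and_error_ranking(pairs)
    print(f"WER = {wer:.2%}")
    for (op, r, h), count in ranking:
        print(f"{count}x {op}: {r!r} -> {h!r}")
```

Under this frequency-only ranking, all errors with the same count are equally serious, regardless of whether they affect a named entity; a task-aware measure such as ATENE is intended to weight errors by their downstream impact on NER instead.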
Anthology ID: L16-1297
Volume: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month: May
Year: 2016
Address: Portorož, Slovenia
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 1883–1889
URL: https://www.aclweb.org/anthology/L16-1297
PDF: http://aclanthology.lst.uni-saarland.de/L16-1297.pdf