Christian Bonkowski


The MoveOn Motorcycle Speech Corpus
Thomas Winkler | Theodoros Kostoulas | Richard Adderley | Christian Bonkowski | Todor Ganchev | Joachim Köhler | Nikos Fakotakis
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

A speech and noise corpus addressing the extreme conditions of the motorcycle environment has been developed within the MoveOn project. Speech utterances in British English were recorded and processed with a view to command-and-control and template-driven dialog systems on the motorcycle. The major part of the corpus comprises noisy speech and environmental noise recorded on a motorcycle, but several clean speech recordings made in a silent environment are also available. The corpus development focuses on distortion-free recordings and accurate descriptions of both the recorded speech and the noise: speech segments are annotated, and so is the environmental noise. The corpus is small, with about 12 hours of clean and noisy speech utterances and about 30 hours of environmental noise segments without speech. This paper addresses the motivation for and development of the speech corpus, and presents some statistics and results of the database creation.