Dynamic Sentence Boundary Detection for Simultaneous Translation

Ruiqing Zhang, Chuanqiang Zhang


Abstract
Simultaneous translation is a major challenge in which translation starts before the source sentence is finished. Most studies take transcriptions as input and focus on balancing translation quality and latency for each sentence. However, most ASR systems cannot provide accurate sentence boundaries in real time. Segmenting the incoming word stream into sentences before translation is therefore a key problem. In this paper, we propose a novel method for sentence boundary detection that treats it as a multi-class classification task under the end-to-end pre-training framework. Experiments show significant improvements in both translation quality and latency.
Anthology ID:
2020.autosimtrans-1.1
Volume:
Proceedings of the First Workshop on Automatic Simultaneous Translation
Month:
July
Year:
2020
Address:
Seattle, Washington
Venues:
ACL | AutoSimTrans | WS
Publisher:
Association for Computational Linguistics
Pages:
1–9
URL:
https://www.aclweb.org/anthology/2020.autosimtrans-1.1
DOI:
10.18653/v1/2020.autosimtrans-1.1
PDF:
http://aclanthology.lst.uni-saarland.de/2020.autosimtrans-1.1.pdf
Video:
http://slideslive.com/38929917