Deep Learning for Punctuation Restoration in Medical Reports

Wael Salloum, Greg Finley, Erik Edwards, Mark Miller, David Suendermann-Oeft


Abstract
In clinical dictation, speakers try to be as concise as possible to save time, often resulting in utterances without explicit punctuation commands. Since the end product of a dictated report, e.g. an out-patient letter, does require correct orthography, including exact punctuation, the latter needs to be restored, preferably by automated means. This paper describes a method for punctuation restoration based on a state-of-the-art stack of NLP and machine learning techniques, including bidirectional recurrent neural networks (B-RNNs) with an attention mechanism and late fusion, as well as a feature extraction technique tailored to the processing of medical terminology using a novel vocabulary reduction model. To the best of our knowledge, the resulting performance is superior to that reported in prior art on similar tasks.
Anthology ID:
W17-2319
Volume:
BioNLP 2017
Month:
August
Year:
2017
Address:
Vancouver, Canada
Venues:
BioNLP | WS
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
159–164
URL:
https://www.aclweb.org/anthology/W17-2319
DOI:
10.18653/v1/W17-2319
PDF:
http://aclanthology.lst.uni-saarland.de/W17-2319.pdf