Context-Aware Automatic Text Simplification of Health Materials in Low-Resource Domains

Tarek Sakakini, Jong Yoon Lee, Aditya Duri, Renato F.L. Azevedo, Victor Sadauskas, Kuangxiao Gu, Suma Bhat, Dan Morrow, James Graumlich, Saqib Walayat, Mark Hasegawa-Johnson, Thomas Huang, Ann Willemsen-Dunlap, Donald Halpin


Abstract
Healthcare systems have increased patients’ exposure to their own health materials in order to improve patients’ health, but this effort has been impeded by patients’ lack of understanding of those materials. We address potential barriers to comprehension by developing a context-aware text simplification system for health material. Given the scarcity of annotated parallel corpora in healthcare domains, we design our system to be independent of a parallel corpus, complementing data-driven neural methods, which apply when such corpora are available. Our system compensates for the lack of direct supervision by using a biomedical lexical database, the Unified Medical Language System (UMLS). Compared to a competitive prior approach that uses a tool for identifying biomedical concepts and a consumer-directed vocabulary list, we empirically show that our system is more accurate due to improved handling of ambiguous terms. We also show that our system is more accurate than directly supervised neural methods in this low-resource setting. Finally, we show the direct impact of our system on laypeople’s comprehension of health material via a human subjects study (n=160).
Anthology ID:
2020.louhi-1.13
Volume:
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
Month:
November
Year:
2020
Address:
Online
Venues:
EMNLP | Louhi
Publisher:
Association for Computational Linguistics
Pages:
115–126
URL:
https://www.aclweb.org/anthology/2020.louhi-1.13
DOI:
10.18653/v1/2020.louhi-1.13
PDF:
http://aclanthology.lst.uni-saarland.de/2020.louhi-1.13.pdf