Comparing the Intrinsic Performance of Clinical Concept Embeddings by Their Field of Medicine

John-Jose Nunez, Giuseppe Carenini


Abstract
Pre-trained word embeddings are becoming increasingly popular for natural language processing tasks. This includes medical applications, where embeddings are trained for clinical concepts using specific medical data. Recent work continues to improve on these embeddings. However, no one has yet sought to determine whether these embeddings work as well for one field of medicine as they do for others. In this work, we use intrinsic methods to evaluate embeddings from the various fields of medicine as defined by their ICD-9 systems. We find significant differences between fields, and motivate future work to investigate whether extrinsic tasks will follow a similar pattern.
Anthology ID:
D19-6202
Volume:
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
Month:
November
Year:
2019
Address:
Hong Kong
Venues:
EMNLP | Louhi | WS
Publisher:
Association for Computational Linguistics
Pages:
11–17
URL:
https://www.aclweb.org/anthology/D19-6202
DOI:
10.18653/v1/D19-6202
PDF:
http://aclanthology.lst.uni-saarland.de/D19-6202.pdf