Document Representation Learning for Patient History Visualization

Halid Ziya Yerebakan, Yoshihisa Shinagawa, Parmeet Bhatia, Yiqiang Zhan


Abstract
We tackle the problem of generating a diagrammatic summary of a set of documents, each of which pertains to loosely related topics. In particular, we aim at visualizing the medical histories of patients. In medicine, choosing relevant reports from a patient’s past exams for comparison provides valuable information for precise treatment planning. Manually finding the relevant reports for comparison studies in a large database is time-consuming and could result in critical information being overlooked. This task can be automated by defining a similarity measure among documents, which is nontrivial since these documents are often stored as unstructured text. To this end, we use a representation learning algorithm that creates a semantic representation space for documents in which clinically related documents lie close to each other. We utilize referral information to weakly supervise an LSTM network to learn this semantic space. The abstract representations within this semantic space are not only useful for visualizing disease progressions corresponding to the relevant report groups of a patient, but are also beneficial for analyzing diseases at the population level. The key tool proposed here is clustering of documents based on a document similarity metric learned from corpora.
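The paper itself does not include code; as a rough illustration of the clustering step described above, the sketch below greedily groups reports whose embeddings (assumed to come from the trained LSTM encoder) are close under cosine similarity. The function name `group_related_reports` and the similarity threshold are hypothetical, not from the paper.

```python
import numpy as np

def cosine_similarity(a, b):
    # standard cosine similarity between two embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def group_related_reports(embeddings, threshold=0.8):
    """Greedily group report embeddings: each report joins the first
    existing group whose centroid is within the similarity threshold,
    otherwise it starts a new group. (Illustrative only; the paper's
    actual clustering procedure may differ.)"""
    groups = []
    for idx, vec in enumerate(embeddings):
        for group in groups:
            centroid = np.mean([embeddings[i] for i in group], axis=0)
            if cosine_similarity(vec, centroid) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups
```

With toy 2-D embeddings, reports pointing in roughly the same direction end up in the same group, mimicking how clinically related reports would cluster in the learned semantic space.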
Anthology ID:
C18-2007
Volume:
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
30–33
URL:
https://www.aclweb.org/anthology/C18-2007
PDF:
http://aclanthology.lst.uni-saarland.de/C18-2007.pdf