Stacking With Auxiliary Features for Entity Linking in the Medical Domain

Nazneen Fatema Rajani, Mihaela Bornea, Ken Barker


Abstract
Linking spans of natural language text to concepts in a structured source is an important task for many problems. It allows intelligent systems to leverage rich knowledge available in those sources (such as concept properties and relations) to enhance the semantics of the mentions of these concepts in text. In the medical domain, it is common to link text spans to medical concepts in large, curated knowledge repositories such as the Unified Medical Language System. Different approaches have different strengths: some are precision-oriented, some recall-oriented; some better at considering context but more prone to hallucination. The variety of techniques suggests that ensembling could outperform component technologies at this task. In this paper, we describe our process for building a Stacking ensemble using additional, auxiliary features for Entity Linking in the medical domain. We report experiments that show that naive ensembling does not always outperform component Entity Linking systems, that stacking usually outperforms naive ensembling, and that auxiliary features added to the stacker further improve its performance on three distinct datasets. Our best model produces state-of-the-art results on several medical datasets.
Anthology ID:
W17-2305
Volume:
BioNLP 2017
Month:
August
Year:
2017
Address:
Vancouver, Canada,
Venues:
BioNLP | WS
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Note:
Pages:
39–47
Language:
URL:
https://www.aclweb.org/anthology/W17-2305
DOI:
10.18653/v1/W17-2305
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/W17-2305.pdf