Neural Domain Adaptation for Biomedical Question Answering

Georg Wiese, Dirk Weissenborn, Mariana Neves


Abstract
Factoid question answering (QA) has recently benefited from the development of deep learning (DL) systems. Neural network models outperform traditional approaches in domains where large datasets exist, such as SQuAD (ca. 100,000 questions) for Wikipedia articles. However, these systems have not yet been applied to QA in more specific domains, such as biomedicine, because datasets are generally too small to train a DL system from scratch. For example, the BioASQ dataset for biomedical QA comprises less then 900 factoid (single answer) and list (multiple answers) QA instances. In this work, we adapt a neural QA system trained on a large open-domain dataset (SQuAD, source) to a biomedical dataset (BioASQ, target) by employing various transfer learning techniques. Our network architecture is based on a state-of-the-art QA system, extended with biomedical word embeddings and a novel mechanism to answer list questions. In contrast to existing biomedical QA systems, our system does not rely on domain-specific ontologies, parsers or entity taggers, which are expensive to create. Despite this fact, our systems achieve state-of-the-art results on factoid questions and competitive results on list questions.
Anthology ID:
K17-1029
Volume:
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Note:
Pages:
281–289
Language:
URL:
https://www.aclweb.org/anthology/K17-1029
DOI:
10.18653/v1/K17-1029
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/K17-1029.pdf
Poster:
 K17-1029.Poster.pdf