Extracting Semantic Aspects for Structured Representation of Clinical Trial Eligibility Criteria

Tirthankar Dasgupta, Ishani Mondal, Abir Naskar, Lipika Dey


Abstract
Eligibility criteria in clinical trials specify the characteristics that a patient must or must not possess in order to be treated according to a standard clinical care guideline. Because manual eligibility determination is time-consuming, there is a pressing need to automatically structure eligibility criteria into semantic categories, or aspects. Existing methods use hand-crafted rules and feature-based statistical machine learning to dynamically induce semantic aspects. To deal with the paucity of aspect-annotated clinical trial data, we propose a novel weakly-supervised, co-training-based method that exploits a large pool of unlabeled criteria sentences to augment the limited supervised training data and thereby improve performance. Experiments with 0.2M criteria sentences show that the proposed approach outperforms competitive supervised baselines by 12% in micro-averaged F1 score across all aspects. A deeper analysis shows that domain-specific information boosts performance by a significant margin.
Anthology ID:
2020.clinicalnlp-1.27
Volume:
Proceedings of the 3rd Clinical Natural Language Processing Workshop
Month:
November
Year:
2020
Address:
Online
Venues:
ClinicalNLP | EMNLP
Publisher:
Association for Computational Linguistics
Pages:
243–248
URL:
https://www.aclweb.org/anthology/2020.clinicalnlp-1.27
DOI:
10.18653/v1/2020.clinicalnlp-1.27
PDF:
http://aclanthology.lst.uni-saarland.de/2020.clinicalnlp-1.27.pdf