Coreference Resolution in Full Text Articles with BERT and Syntax-based Mention Filtering

Hai-Long Trieu, Anh-Khoa Duong Nguyen, Nhung Nguyen, Makoto Miwa, Hiroya Takamura, Sophia Ananiadou


Abstract
This paper describes our system developed for the coreference resolution task of the CRAFT Shared Tasks 2019. The CRAFT corpus is more challenging than other existing corpora because it contains full text articles. We have employed an existing span-based state-of-theart neural coreference resolution system as a baseline system. We enhance the system with two different techniques to capture longdistance coreferent pairs. Firstly, we filter noisy mentions based on parse trees with increasing the number of antecedent candidates. Secondly, instead of relying on the LSTMs, we integrate the highly expressive language model–BERT into our model. Experimental results show that our proposed systems significantly outperform the baseline. The best performing system obtained F-scores of 44%, 48%, 39%, 49%, 40%, and 57% on the test set with B3, BLANC, CEAFE, CEAFM, LEA, and MUC metrics, respectively. Additionally, the proposed model is able to detect coreferent pairs in long distances, even with a distance of more than 200 sentences.
Anthology ID:
D19-5727
Volume:
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks
Month:
November
Year:
2019
Address:
Hong Kong, China
Venues:
BioNLP | EMNLP | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
196–205
Language:
URL:
https://www.aclweb.org/anthology/D19-5727
DOI:
10.18653/v1/D19-5727
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/D19-5727.pdf