Fine Grained Citation Span for References in Wikipedia

Besnik Fetahu, Katja Markert, Avishek Anand


Abstract
Verifiability is one of the core editing principles in Wikipedia, where editors are encouraged to provide citations for the added content. For a Wikipedia article determining what content is covered by a citation or the citation span is not trivial, an important aspect for automated citation finding for uncovered content, or fact assessments. We address the problem of determining the citation span in Wikipedia articles. We approach this problem by classifying which textual fragments in an article are covered or hold true given a citation. We propose a sequence classification approach where for a paragraph and a citation, we determine the citation span at a fine-grained level. We provide a thorough experimental evaluation and compare our approach against baselines adopted from the scientific domain, where we show improvement for all evaluation metrics.
Anthology ID:
D17-1212
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Note:
Pages:
1990–1999
Language:
URL:
https://www.aclweb.org/anthology/D17-1212
DOI:
10.18653/v1/D17-1212
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/D17-1212.pdf
Video:
 https://vimeo.com/238233167