A Simpler and More Generalizable Story Detector using Verb and Character Features

Joshua Eisenberg, Mark Finlayson


Abstract
Story detection is the task of determining whether or not a unit of text contains a story. Prior approaches achieved a maximum performance of 0.66 F1 and did not generalize well across different corpora. We present a new state-of-the-art detector that achieves a maximum performance of 0.75 F1 (a 14% relative improvement), with significantly greater generalizability than previous work. In particular, our detector achieves performance above 0.70 F1 across a variety of combinations of lexically different corpora for training and testing, as well as dramatic improvements (up to 4,000%) in performance when trained on a small, disfluent data set. The new detector uses two basic types of features, ones related to events and ones related to characters, totaling 283 specific features overall. Previous detectors used tens of thousands of features, so this detector represents a significant simplification along with increased performance.
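As a reading aid for the figures quoted in the abstract, the following minimal sketch shows how an F1 score is computed from precision and recall and how the reported gain from 0.66 to 0.75 F1 works out as a relative improvement. The function names are hypothetical and chosen for illustration; they are not taken from the paper or its code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # conventional value when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


def relative_improvement(old: float, new: float) -> float:
    """Return the percentage gain of `new` over `old`."""
    return (new - old) / old * 100


# F1 values quoted in the abstract: prior best vs. the new detector.
gain = relative_improvement(0.66, 0.75)
print(f"Relative F1 improvement: {gain:.1f}%")
```

The result rounds to the 14% stated in the abstract, which indicates the figure is a relative gain over the earlier 0.66 F1 rather than an absolute difference of 9 points.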
Anthology ID: D17-1287
Volume: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month: September
Year: 2017
Address: Copenhagen, Denmark
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 2708–2715
URL: https://www.aclweb.org/anthology/D17-1287
DOI: 10.18653/v1/D17-1287
PDF: http://aclanthology.lst.uni-saarland.de/D17-1287.pdf