Building a Bank of Semantically Encoded Narratives

David K. Elson, Kathleen R. McKeown


Abstract
We propose a methodology for a novel type of discourse annotation whose model is tuned to the analysis of a text as narrative. This is intended to be the basis of a “story bank” resource that would facilitate the automatic analysis of narrative structure and content. The methodology calls for annotators to construct propositions that approximate a reference text, by selecting predicates and arguments from among controlled vocabularies drawn from resources such as WordNet and VerbNet. Annotators then integrate the propositions into a conceptual graph that maps out the entire discourse; the edges represent temporal, causal and other relationships at the level of story content. Because annotators must identify the recurring objects and themes that appear in the text, they also perform coreference resolution and word sense disambiguation as they encode propositions. We describe a collection experiment and a method for determining inter-annotator agreement when multiple annotators encode the same short story. Finally, we describe ongoing work toward extending the method to integrate the annotator’s interpretations of character agency (the goals, plans and beliefs that are relevant, yet not explicitly stated in the text).
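As a rough illustration of the encoding the abstract describes, the sketch below stores propositions (a predicate plus arguments) as nodes and labeled relations as edges. It is not the authors' implementation: the class names, the edge labels "precedes" and "causes", and the two-sentence toy story are all assumptions made for this example, and a real encoding would draw predicates from controlled vocabularies such as WordNet and VerbNet rather than free strings.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Proposition:
    """One encoded proposition (hypothetical structure, not from the paper)."""
    predicate: str               # stand-in for a controlled-vocabulary predicate
    arguments: Tuple[str, ...]   # identifiers of recurring story entities


@dataclass
class StoryGraph:
    """Propositions as nodes; labeled story-level relations as edges."""
    nodes: List[Proposition] = field(default_factory=list)
    edges: List[Tuple[int, str, int]] = field(default_factory=list)

    def add(self, prop: Proposition) -> int:
        """Append a proposition and return its node index."""
        self.nodes.append(prop)
        return len(self.nodes) - 1

    def relate(self, source: int, label: str, target: int) -> None:
        """Record a labeled edge (e.g. temporal or causal) between two nodes."""
        self.edges.append((source, label, target))

    def mentions(self, entity: str) -> List[int]:
        """Indices of propositions that take the given entity as an argument."""
        return [i for i, p in enumerate(self.nodes) if entity in p.arguments]


# Invented toy story: reusing the same entity identifier across
# propositions is a simple form of coreference resolution.
graph = StoryGraph()
a = graph.add(Proposition("drop", ("crow", "cheese")))
b = graph.add(Proposition("take", ("fox", "cheese")))
graph.relate(a, "precedes", b)   # assumed temporal edge label
graph.relate(a, "causes", b)     # assumed causal edge label
```

Here `graph.mentions("cheese")` returns the indices of both propositions, which shows how a shared identifier ties repeated references to one story entity.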
Anthology ID:
L10-1578
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/835_Paper.pdf