Automated Acquisition of Patterns for Coding Political Event Data: Two Case Studies

Peter Makarov


Abstract
We present a simple approach to the generation and labeling of extraction patterns for coding political event data, an important task in computational social science. We use weak supervision to identify pattern candidates and learn distributed representations for them. Given seed extraction patterns from existing pattern dictionaries, we use label propagation to label pattern candidates. We present two case studies: (i) we derive patterns of acceptable quality for a number of international relations and conflict categories using the pattern candidates of O’Connor et al. (2013); (ii) we derive patterns for coding protest events that outperform an established set of Tabari / Petrarch hand-crafted patterns.
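The labeling step described in the abstract can be illustrated with a minimal sketch of label propagation over pattern embeddings. This is not the paper's implementation: the function name `propagate_labels`, the two-dimensional toy vectors, the pattern strings, and the labels `PROTEST` and `AGREE` are all hypothetical, and the similarity graph (non-negative cosine similarity, fully connected) is an assumed choice.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate_labels(vectors, seeds, iterations=20):
    """Spread seed labels to unlabeled patterns over a similarity graph.

    vectors: pattern -> embedding (hypothetical toy data here)
    seeds:   pattern -> label, for patterns taken from an existing dictionary
    """
    labels = sorted(set(seeds.values()))
    names = list(vectors)
    # Edge weights: non-negative cosine similarity between distinct patterns.
    weights = {
        a: {b: max(cosine(vectors[a], vectors[b]), 0.0) for b in names if b != a}
        for a in names
    }
    # Seeds start one-hot; candidates start with no label mass.
    scores = {
        n: {lab: 1.0 if seeds.get(n) == lab else 0.0 for lab in labels}
        for n in names
    }
    for _ in range(iterations):
        updated = {}
        for n in names:
            if n in seeds:
                updated[n] = scores[n]  # seed labels stay clamped
                continue
            total = sum(weights[n].values())
            # Weighted average of the neighbors' current label scores.
            updated[n] = {
                lab: (
                    sum(w * scores[m][lab] for m, w in weights[n].items()) / total
                    if total
                    else 0.0
                )
                for lab in labels
            }
        scores = updated
    # Each candidate receives its highest-scoring label.
    return {n: max(scores[n], key=scores[n].get) for n in names if n not in seeds}


# Hypothetical toy embeddings and seed patterns, for illustration only.
vectors = {
    "X protests against Y": [0.9, 0.1],
    "X demonstrates against Y": [0.8, 0.2],
    "X signs agreement with Y": [0.1, 0.9],
    "X reaches deal with Y": [0.2, 0.8],
}
seeds = {"X protests against Y": "PROTEST", "X signs agreement with Y": "AGREE"}

result = propagate_labels(vectors, seeds)
print(result)
```

In this toy setup each unlabeled candidate ends up with the label of the seed it lies closest to in embedding space. A practical system would likely sparsify the graph (for example, keeping only the nearest neighbors) and threshold low-confidence assignments.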
Anthology ID:
W18-4512
Volume:
Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico
Venues:
COLING | LaTeCH | WS
Publisher:
Association for Computational Linguistics
Pages:
103–112
URL:
https://www.aclweb.org/anthology/W18-4512
PDF:
http://aclanthology.lst.uni-saarland.de/W18-4512.pdf