Masking Actor Information Leads to Fairer Political Claims Detection
Erenay Dayanik | Sebastian Padó
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

A central concern in Computational Social Sciences (CSS) is fairness: where the role of NLP is to scale up text analysis to large corpora, the quality of automatic analyses should be as independent as possible of textual properties. We analyze the performance of a state-of-the-art neural model on the task of political claims detection (i.e., the identification of forward-looking statements made by political actors) and identify a strong frequency bias: claims made by frequent actors are recognized better. We propose two simple debiasing methods which mask proper names and pronouns during training of the model, thus removing personal information bias. We find that (a) these methods significantly decrease frequency bias while keeping the overall performance stable; and (b) the resulting models improve when evaluated in an out-of-domain setting.
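
The masking idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the input is already tokenized and tagged with Universal POS labels (`PROPN`, `PRON`), and the placeholder strings `[NAME]` and `[PRON]` as well as the function name `mask_actors` are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

# Hypothetical placeholder tokens; the abstract does not specify these strings.
NAME_MASK = "[NAME]"
PRON_MASK = "[PRON]"


def mask_actors(tagged: List[Tuple[str, str]], mask_pronouns: bool = True) -> List[str]:
    """Replace proper nouns (and optionally pronouns) with placeholder tokens.

    `tagged` is a list of (token, POS tag) pairs; Universal POS tags are assumed.
    """
    masked = []
    for token, pos in tagged:
        if pos == "PROPN":
            masked.append(NAME_MASK)   # hide the actor's name
        elif pos == "PRON" and mask_pronouns:
            masked.append(PRON_MASK)   # hide pronouns referring to actors
        else:
            masked.append(token)       # keep all other tokens unchanged
    return masked


# Toy sentence with hand-assigned tags (assumed input format).
example = [("Merkel", "PROPN"), ("says", "VERB"), ("she", "PRON"),
           ("will", "AUX"), ("act", "VERB")]
print(mask_actors(example))
print(mask_actors(example, mask_pronouns=False))
```

Applying such a function to the training data would keep the classifier from seeing actor names directly, which is the intuition behind reducing frequency bias; the `mask_pronouns` flag simply lets the two masking variants (names only, names plus pronouns) be compared.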

DEbateNet-mig15: Tracing the 2015 Immigration Debate in Germany Over Time
Gabriella Lapesa | Andre Blessing | Nico Blokker | Erenay Dayanik | Sebastian Haunss | Jonas Kuhn | Sebastian Padó
Proceedings of the 12th Language Resources and Evaluation Conference

DEbateNet-mig15 is a manually annotated dataset for German which covers the public debate on immigration in 2015. The building block of our annotation is the political science notion of a claim, i.e., a statement made by a political actor (a politician, a party, or a group of citizens) that a specific action should be taken (e.g., vacant flats should be assigned to refugees). We identify claims in newspaper articles, assign them to actors and fine-grained categories, and annotate their polarity and date. The aim of this paper is two-fold: first, we release the full DEbateNet-mig15 corpus and document it by means of a quantitative and qualitative analysis; second, we demonstrate its application in a discourse network analysis framework, which enables us to capture the temporal dynamics of the political debate.

Swimming with the Tide? Positional Claim Detection across Political Text Types
Nico Blokker | Erenay Dayanik | Gabriella Lapesa | Sebastian Padó
Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science

Manifestos are official documents of political parties, providing a comprehensive topical overview of the electoral programs. Voters, however, seldom read them and often prefer other channels, such as newspaper articles, to understand the party positions on various policy issues. The natural question to ask is how compatible these two formats (manifesto and newspaper reports) are in their representation of party positioning. We address this question with an approach that combines political science (manual annotation and analysis) and natural language processing (supervised claim identification) in a cross-text type setting: we train a classifier on annotated newspaper data and test its performance on manifestos. Our findings show a) strong performance for supervised classification even across text types and b) a substantive overlap between the two formats in terms of party positioning, with differences regarding the salience of specific issues.


Who Sides with Whom? Towards Computational Construction of Discourse Networks for Political Debates
Sebastian Padó | Andre Blessing | Nico Blokker | Erenay Dayanik | Sebastian Haunss | Jonas Kuhn
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Understanding the structures of political debates (which actors make what claims) is essential for understanding democratic political decision making. The vision of computational construction of such discourse networks from newspaper reports brings together political science and natural language processing. This paper presents three contributions towards this goal: (a) a requirements analysis, linking the task to knowledge base population; (b) an annotated pilot corpus of migration claims based on German newspaper reports; (c) initial modeling results.
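
As a rough illustration of what a discourse network captures, the sketch below links actors who take the same stance on the same claim. It is an assumption-laden toy, not the paper's model: the annotation format (actor, claim category, polarity), the actor names, the second claim category, and the function name `agreement_edges` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy annotations: (actor, claim category, polarity).
annotations = [
    ("Actor A", "assign vacant flats to refugees", "+"),
    ("Actor B", "assign vacant flats to refugees", "+"),
    ("Actor C", "assign vacant flats to refugees", "-"),
    ("Actor A", "introduce upper limit", "-"),
    ("Actor C", "introduce upper limit", "+"),
]


def agreement_edges(rows):
    """Count, for each pair of actors, how many (claim, polarity) stances they share."""
    by_stance = defaultdict(set)
    for actor, claim, polarity in rows:
        by_stance[(claim, polarity)].add(actor)

    weights = defaultdict(int)
    for actors in by_stance.values():
        # Every pair of actors holding the same stance gets one unit of edge weight.
        for a, b in combinations(sorted(actors), 2):
            weights[(a, b)] += 1
    return dict(weights)


edges = agreement_edges(annotations)
print(edges)
```

In this toy data only Actor A and Actor B share a stance, so they are the only pair connected; a fuller treatment would also track opposing stances and the date of each claim.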

Team Howard Beale at SemEval-2019 Task 4: Hyperpartisan News Detection with BERT
Osman Mutlu | Ozan Arkan Can | Erenay Dayanik
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes our system for SemEval-2019 Task 4: Hyperpartisan News Detection (Kiesel et al., 2019). We use a pretrained BERT (Devlin et al., 2018) architecture and investigate the effect of different fine-tuning regimes on the final classification task. We show that additional pretraining on the news domain improves performance on the Hyperpartisan News Detection task. Our system ranked 8th out of 42 teams with 78.3% accuracy on the held-out test dataset.