Anjalie Field


2020

pdf bib
A Generative Approach to Titling and Clustering Wikipedia Sections
Anjalie Field | Sascha Rothe | Simon Baumgartner | Cong Yu | Abe Ittycheriah
Proceedings of the Fourth Workshop on Neural Generation and Translation

We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic encoding and can be used to generate section embeddings. We additionally introduce a new loss function, which further encourages the decoder to generate high-quality embeddings.

pdf bib
Demoting Racial Bias in Hate Speech Detection
Mengzhou Xia | Anjalie Field | Yulia Tsvetkov
Proceedings of the Eighth International Workshop on Natural Language Processing for Social Media

In the task of hate speech detection, there exists a high correlation between African American English (AAE) and annotators’ perceptions of toxicity in current datasets. This bias in annotated training data and the tendency of machine learning models to amplify it cause AAE text to often be mislabeled as abusive/offensive/hate speech (high false positive rate) by current hate speech classifiers. Here, we use adversarial training to mitigate this bias. Experimental results on one hate speech dataset and one AAE dataset suggest that our method is able to reduce the false positive rate for AAE text with only a minimal compromise on the performance of hate speech classification.

pdf bib
Unsupervised Discovery of Implicit Gender Bias
Anjalie Field | Yulia Tsvetkov
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Despite their prevalence in society, social biases are difficult to identify, primarily because human judgements in this domain can be unreliable. We take an unsupervised approach to identifying gender bias against women at a comment level and present a model that can surface text likely to contain bias. Our main challenge is forcing the model to focus on signs of implicit bias, rather than other artifacts in the data. Thus, our methodology involves reducing the influence of confounds through propensity matching and adversarial learning. Our analysis shows how biased comments directed towards female politicians contain mixed criticisms, while comments directed towards other female public figures focus on appearance and sexualization. Ultimately, our work offers a way to capture subtle biases in various domains without relying on subjective human judgements.

2019

pdf bib
Entity-Centric Contextual Affective Analysis
Anjalie Field | Yulia Tsvetkov
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

While contextualized word representations have improved state-of-the-art benchmarks in many NLP tasks, their potential usefulness for social-oriented tasks remains largely unexplored. We show how contextualized word embeddings can be used to capture affect dimensions in portrayals of people. We evaluate our methodology quantitatively, on held-out affect lexicons, and qualitatively, through case examples. We find that contextualized word representations do encode meaningful affect information, but they are heavily biased towards their training data, which limits their usefulness to in-domain analyses. We ultimately use our method to examine differences in portrayals of men and women.

2018

pdf bib
Framing and Agenda-setting in Russian News: a Computational Analysis of Intricate Political Strategies
Anjalie Field | Doron Kliger | Shuly Wintner | Jennifer Pan | Dan Jurafsky | Yulia Tsvetkov
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Amidst growing concern over media manipulation, NLP attention has focused on overt strategies like censorship and “fake news”. Here, we draw on two concepts from political science literature to explore subtler strategies for government media manipulation: agenda-setting (selecting what topics to cover) and framing (deciding how topics are covered). We analyze 13 years (100K articles) of the Russian newspaper Izvestia and identify a strategy of distraction: articles mention the U.S. more frequently in the month directly following an economic downturn in Russia. We introduce embedding-based methods for cross-lingually projecting English frames to Russian, and discover that these articles emphasize U.S. moral failings and threats to the U.S. Our work offers new ways to identify subtle media manipulation strategies at the intersection of agenda-setting and framing.