Emily Ahn


2020

Understanding Linguistic Accommodation in Code-Switched Human-Machine Dialogues
Tanmay Parekh | Emily Ahn | Yulia Tsvetkov | Alan W Black
Proceedings of the 24th Conference on Computational Natural Language Learning

Code-switching is a ubiquitous phenomenon in multilingual communities. Natural language technologies that wish to communicate like humans must therefore adaptively incorporate code-switching techniques when they are deployed in multilingual settings. To this end, we propose a Hindi-English human-machine dialogue system that elicits code-switching conversations in a controlled setting. It uses different code-switching agent strategies to understand how users respond and accommodate to the agent’s language choice. Through this system, we collect and release a new dataset, CommonDost, comprising 439 human-machine multilingual conversations. We adapt pre-defined metrics to discover linguistic accommodation from users to agents. Finally, we compare these dialogues with Spanish-English dialogues collected in a similar setting, and analyze the impact of linguistic and socio-cultural factors on code-switching patterns across the two language pairs.

What Code-Switching Strategies are Effective in Dialog Systems?
Emily Ahn | Cecilia Jimenez | Yulia Tsvetkov | Alan W Black
Proceedings of the Society for Computation in Linguistics 2020

2019

Finding Microaggressions in the Wild: A Case for Locating Elusive Phenomena in Social Media Posts
Luke Breitfeller | Emily Ahn | David Jurgens | Yulia Tsvetkov
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Microaggressions are subtle, often veiled, manifestations of human biases. These uncivil interactions can have a powerful negative impact on people by marginalizing minorities and disadvantaged groups. The linguistic subtlety of microaggressions in communication has made it difficult for researchers to analyze their exact nature, and to quantify and extract microaggressions automatically. Specifically, the lack of a corpus of real-world microaggressions and of objective criteria for annotating them has prevented researchers from addressing these problems at scale. In this paper, we devise a general but nuanced, computationally operationalizable typology of microaggressions based on a small subset of data that we have. We then create two datasets: one with examples of diverse types of microaggressions recollected by their targets, and another with gender-based microaggressions in public conversations on social media. We introduce a new, more objective criterion for annotation and an active-learning-based procedure that increases the likelihood of surfacing posts containing microaggressions. Finally, we analyze the trends that emerge from these new datasets.

2016

Improving Fluency in Narrative Text Generation With Grammatical Transformations and Probabilistic Parsing
Emily Ahn | Fabrizio Morbini | Andrew Gordon
Proceedings of the 9th International Natural Language Generation conference