Understanding Script-Mixing: A Case Study of Hindi-English Bilingual Twitter Users

Abhishek Srivastava, Kalika Bali, Monojit Choudhury


Abstract
In a multi-lingual and multi-script society such as India, many users resort to code-mixing while typing on social media. While code-mixing has received a lot of attention in the past few years, it has mostly been studied within a single-script scenario. In this work, we present a case study of Hindi-English bilingual Twitter users while considering the nuances that come with the intermixing of different scripts. We present a concise analysis of how scripts and languages interact in communities and cultures where code-mixing is rampant and offer certain insights into the findings. Our analysis shows that both intra-sentential and inter-sentential script-mixing are present on Twitter and show different behavior in different contexts. Examples suggest that script can be employed as a tool for emphasizing certain phrases within a sentence or disambiguating the meaning of a word. Script choice can also be an indicator of whether a word is borrowed or not. We present our analysis along with examples that bring out the nuances of the different cases.
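The distinction the abstract draws between intra-sentential and inter-sentential script-mixing can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the authors' method: the function names (`char_script`, `token_script`, `classify_tweet`), the three output labels, and the rule that a tweet arrives already split into sentences are all hypothetical, and script is detected purely from Unicode code point ranges.

```python
# Illustrative sketch only; the names and labels here are assumptions,
# not taken from the paper.

def char_script(ch):
    """Return 'Devanagari', 'Latin', or None for a single character."""
    # Devanagari block is U+0900..U+097F; the danda and double danda
    # (U+0964, U+0965) are punctuation shared across scripts, so skip them.
    if "\u0900" <= ch <= "\u097f" and ch not in "\u0964\u0965":
        return "Devanagari"
    if ch.isascii() and ch.isalpha():
        return "Latin"
    return None  # digits, punctuation, emoji, other scripts


def token_script(token):
    """Label a whitespace-delimited token by the scripts of its characters."""
    scripts = {char_script(ch) for ch in token} - {None}
    if not scripts:
        return "Other"
    if len(scripts) == 1:
        return next(iter(scripts))
    return "Mixed"


def sentence_scripts(sentence):
    """Set of scripts ('Devanagari', 'Latin', 'Mixed') seen in one sentence."""
    labels = {token_script(tok) for tok in sentence.split()}
    return labels - {"Other"}


def classify_tweet(sentences):
    """Classify a tweet given as a list of pre-split sentences.

    'intra-sentential': some sentence contains more than one script.
    'inter-sentential': each sentence uses one script, but sentences differ.
    'single-script':    everything is in one script (or has no letters).
    """
    per_sentence = [sentence_scripts(s) for s in sentences]
    if any("Mixed" in s or len(s) > 1 for s in per_sentence):
        return "intra-sentential"
    all_scripts = set().union(*per_sentence) if per_sentence else set()
    if len(all_scripts) > 1:
        return "inter-sentential"
    return "single-script"


if __name__ == "__main__":
    print(classify_tweet(["I love this गाना"]))
    print(classify_tweet(["यह अच्छा है", "I agree"]))
    print(classify_tweet(["this is fine"]))
```

A code-point check like this identifies script only; it says nothing about language, so romanized Hindi would be labeled Latin here.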
Anthology ID:
2020.calcs-1.5
Volume:
Proceedings of the 4th Workshop on Computational Approaches to Code Switching
Month:
May
Year:
2020
Address:
Marseille, France
Venues:
CALCS | LREC | WS
Publisher:
European Language Resources Association
Pages:
36–44
Language:
English
URL:
https://www.aclweb.org/anthology/2020.calcs-1.5
PDF:
http://aclanthology.lst.uni-saarland.de/2020.calcs-1.5.pdf