Predicting Twitter User Demographics from Names Alone

Zach Wood-Doughty, Nicholas Andrews, Rebecca Marvin, Mark Dredze


Abstract
Social media analysis frequently requires tools that can automatically infer demographics to contextualize trends. These tools often require hundreds of user-authored messages for each user, which may be prohibitive to obtain when analyzing millions of users. We explore character-level neural models that learn a representation of a user’s name and screen name to predict gender and ethnicity, allowing for demographic inference with minimal data. We release trained models which may enable new demographic analyses that would otherwise require enormous amounts of data collection.
Anthology ID:
W18-1114
Volume:
Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media
Month:
June
Year:
2018
Address:
New Orleans, Louisiana, USA
Venues:
NAACL | PEOPLES | WS
Publisher:
Association for Computational Linguistics
Pages:
105–111
URL:
https://www.aclweb.org/anthology/W18-1114
DOI:
10.18653/v1/W18-1114
PDF:
http://aclanthology.lst.uni-saarland.de/W18-1114.pdf