Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning

Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy


Abstract
Twitter should be an ideal place to get a fresh read on how different issues are playing with the public, one that’s potentially more reflective of democracy in this new media age than traditional polls. Pollsters typically ask people a fixed set of questions, while in social media people use their own voices to speak about whatever is on their minds. However, the demographic distribution of users on Twitter is not representative of the general population. In this paper, we present a demographic classifier for gender, age, political orientation, and location on Twitter. We collected and curated a robust Twitter demographic dataset for this task. Our classifier uses a deep multi-modal multi-task learning architecture to reach state-of-the-art performance, achieving F1-scores of 0.89, 0.82, 0.86, and 0.68 for gender, age, political orientation, and location, respectively.
Anthology ID:
P17-2076
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
478–483
URL:
https://www.aclweb.org/anthology/P17-2076
DOI:
10.18653/v1/P17-2076
PDF:
http://aclanthology.lst.uni-saarland.de/P17-2076.pdf