Observational Comparison of Geo-tagged and Randomly-drawn Tweets

Tom Lippincott, Annabelle Carrell


Abstract
Twitter is a ubiquitous source of micro-blog social media data, giving the academic, industrial, and public sectors real-time access to actionable information. A particularly attractive property of some tweets is *geo-tagging*, where a user account has opted in to attaching its current location to each message. Unfortunately (from a researcher’s perspective), only a fraction of Twitter accounts agree to this, and these accounts are likely to differ systematically from the general population. This work is an exploratory study of these differences across the full range of Twitter content, and complements previous studies that focus on the English-language subset. Additionally, we compare methods for querying users by self-identified properties, finding that the constrained semantics of the “description” field provide cleaner, higher-volume results than more complex regular expressions.
Anthology ID:
W18-1107
Volume:
Proceedings of the Second Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media
Month:
June
Year:
2018
Address:
New Orleans, Louisiana, USA
Venues:
NAACL | PEOPLES | WS
Publisher:
Association for Computational Linguistics
Pages:
50–55
URL:
https://www.aclweb.org/anthology/W18-1107
DOI:
10.18653/v1/W18-1107
PDF:
http://aclanthology.lst.uni-saarland.de/W18-1107.pdf