Geolocation Prediction in Twitter Using Location Indicative Words and Textual Features
Lianhua Chi | Kwan Hui Lim | Nebula Alam | Christopher J. Butler
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)
Knowing the location of social media users and their posts is important for applications such as recommending location-based items and services and detecting where crises and disasters occur. This paper describes our submission to the shared task “Geolocation Prediction in Twitter” of the 2nd Workshop on Noisy User-generated Text. For this shared task, we propose an algorithm that predicts the location of Twitter users and tweets using a multinomial Naive Bayes classifier trained on Location Indicative Words and various textual features (such as city/country names, #hashtags and @mentions). We compare our approach against baselines that use Location Indicative Words, city/country names, #hashtags and @mentions as individual feature sets. Experimental results show that our approach outperforms these baselines in terms of classification accuracy, mean error distance and median error distance.
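To make the general idea concrete, the following is a minimal sketch of a multinomial Naive Bayes city classifier over tweet tokens, together with the three evaluation measures named above. It is not the submitted system: the tweets are invented, the city coordinates are approximate, all names (`tokenize`, `MultinomialNB`, `haversine_km`, `evaluate`) are hypothetical, and every token is used as a feature, whereas the paper restricts the vocabulary to Location Indicative Words and selected textual features. Measuring error as the great-circle distance between predicted and true city centres is also an assumption made for illustration.

```python
import math
import re
import statistics
from collections import Counter, defaultdict

# Keep #hashtags and @mentions as single tokens (assumed tokenization).
TOKEN_RE = re.compile(r"[#@]?\w+")

def tokenize(text):
    return TOKEN_RE.findall(text.lower())

class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.totals = {c: sum(self.token_counts[c].values())
                       for c in self.class_counts}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            score = math.log(self.class_counts[label] / n_docs)
            denom = self.totals[label] + self.alpha * vocab_size
            for tok in tokens:
                score += math.log(
                    (self.token_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def evaluate(model, docs, true_cities, coords):
    """Return (accuracy, mean error in km, median error in km)."""
    errors, correct = [], 0
    for doc, truth in zip(docs, true_cities):
        pred = model.predict(doc)
        correct += pred == truth
        errors.append(haversine_km(*coords[pred], *coords[truth]))
    return (correct / len(docs),
            statistics.mean(errors),
            statistics.median(errors))

# Invented toy data; coordinates are approximate city centres.
CITY_COORDS = {"melbourne": (-37.8136, 144.9631), "london": (51.5074, -0.1278)}
TRAIN_DOCS = [
    "stuck on the tram in #melbourne again @yarratrams",
    "footy at the mcg tonight",
    "tube delays on the central line #london",
    "walking along the thames @tfl",
]
TRAIN_LABELS = ["melbourne", "melbourne", "london", "london"]
```

A model would be built with `MultinomialNB().fit(TRAIN_DOCS, TRAIN_LABELS)` and scored with `evaluate(model, test_docs, test_cities, CITY_COORDS)`. Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow on longer texts, such as all of a user's tweets concatenated together.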