Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks

Afshin Rahimi, Timothy Baldwin, Trevor Cohn


Abstract
We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset.
Anthology ID:
D17-1016
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
167–176
URL:
https://www.aclweb.org/anthology/D17-1016
DOI:
10.18653/v1/D17-1016
PDF:
http://aclanthology.lst.uni-saarland.de/D17-1016.pdf
Video:
https://vimeo.com/238228698