Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
Afshin Rahimi, Timothy Baldwin, Trevor Cohn
Abstract
We propose a method for embedding two-dimensional locations in a continuous vector space using a neural network-based model incorporating mixtures of Gaussian distributions, presenting two model variants for text-based geolocation and lexical dialectology. Evaluated over Twitter data, the proposed model outperforms conventional regression-based geolocation and provides a better estimate of uncertainty. We also show the effectiveness of the representation for predicting words from location in lexical dialectology, and evaluate it using the DARE dataset.- Anthology ID:
- D17-1016
- Volume:
- Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 167–176
- Language:
- URL:
- https://www.aclweb.org/anthology/D17-1016
- DOI:
- 10.18653/v1/D17-1016
- PDF:
- http://aclanthology.lst.uni-saarland.de/D17-1016.pdf