Comparing Methods for Measuring Dialect Similarity in Norwegian

Janne Johannessen, Andre Kåsen, Kristin Hagen, Anders Nøklestad, Joel Priestley


Abstract
This article presents four experiments with two methods for measuring dialect similarity in Norwegian: the Levenshtein method and a neural long short-term memory (LSTM) autoencoder network, a machine learning algorithm. The visual output, in the form of dialect maps, is then compared with canonical maps found in the dialect literature. These comparisons allow us to conclude that fine-grained transcriptions of speech are not needed to replicate classical classification patterns.
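As an illustration of the first method named in the abstract, the following is a minimal sketch of a Levenshtein-based distance between two dialects, assuming each dialect is represented by an aligned list of word forms. The function names, the length normalization, the averaging step, and the example word forms are assumptions made here for illustration; they are not taken from the article and may differ from the authors' actual setup.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion and substitution."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def dialect_distance(forms_a: list[str], forms_b: list[str]) -> float:
    """Mean normalized distance over aligned word forms (hypothetical helper)."""
    if not forms_a or len(forms_a) != len(forms_b):
        raise ValueError("word lists must be non-empty and of equal length")
    pairs = zip(forms_a, forms_b)
    return sum(normalized_distance(x, y) for x, y in pairs) / len(forms_a)


# Illustrative word forms only, not data from the article.
print(dialect_distance(["jeg", "ikke"], ["eg", "ikkje"]))
```

Computing this value for every pair of measurement sites would give a distance matrix, which could then be clustered and plotted as a dialect map; that step is left out of the sketch.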
Anthology ID:
2020.lrec-1.658
Volume:
Proceedings of the 12th Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venues:
COLING | LREC
Publisher:
European Language Resources Association
Pages:
5343–5350
Language:
English
URL:
https://www.aclweb.org/anthology/2020.lrec-1.658
PDF:
http://aclanthology.lst.uni-saarland.de/2020.lrec-1.658.pdf