Using OntoLex-Lemon for Representing and Interlinking Lexicographic Collections of Bavarian Dialects

Yalemisew Abgaz


Abstract
This paper describes the ongoing work in converting the lexicographic collection of a non-standard German language dataset (Bavarian Dialects) into a Linguistic Linked Open Data (LLOD) format. The collection is divided into two: questionnaire dataset (DBÖ) which contains details of the questionnaires, questions, collectors, paper slips etc., and the lexical dataset (WBÖ) which contains lexical entries (answers) collected in response to the questions. In its current form, the lexical dataset is available in a TEI/XML format separately from the questionnaire dataset. This paper presents the mapping of the lexical entries in the TEI/XML format into LLOD format using the Ontolex-Lemon model. The paper shows how the data in TEI/XML format is transformed into LLOD and produces a lexicon for Bavarian Dialects. It also presents the approach used to interlink the original questions with the lexical entries. The resulting lexicon complements the questionnaire dataset, which is already in a LLOD format, by semantically interlinking the original questions with the answers and vice-versa.
Anthology ID:
2020.ldl-1.9
Volume:
Proceedings of the 7th Workshop on Linked Data in Linguistics (LDL-2020)
Month:
May
Year:
2020
Address:
Marseille, France
Venues:
LDL | LREC | WS
SIG:
Publisher:
European Language Resources Association
Note:
Pages:
61–69
Language:
English
URL:
https://www.aclweb.org/anthology/2020.ldl-1.9
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.ldl-1.9.pdf