HMSid and HMSid2 at PARSEME Shared Task 2020: Computational Corpus Linguistics and unseen-in-training MWEs

Jean-Pierre Colson


Abstract
This paper is a system description of HMSid, officially submitted to the PARSEME Shared Task 2020 for one language (French) in the open track. It also describes HMSid2, which was sent to the workshop organizers after the deadline and uses the same methodology, but in the closed track. Neither system relies on machine learning; both are based on computational corpus linguistics. Their scores for unseen MWEs are very promising, especially in the case of HMSid2, which would have received the best score for unseen MWEs in the French closed track.
Anthology ID:
2020.mwe-1.15
Volume:
Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
Month:
December
Year:
2020
Address:
online
Venues:
COLING | MWE
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
119–123
URL:
https://www.aclweb.org/anthology/2020.mwe-1.15
PDF:
http://aclanthology.lst.uni-saarland.de/2020.mwe-1.15.pdf