Local-Global Vectors to Improve Unigram Terminology Extraction

Ehsan Amjadian, Diana Inkpen, Tahereh Paribakht, Farahnaz Faez


Abstract
This paper explores a novel method that integrates efficient distributed representations with terminology extraction. We show that information from a small number of observed instances can be combined with local and global word embeddings to substantially improve term extraction results on unigram terms. To do so, we pass the terms extracted by other tools to a filter made of the local-global embeddings and a classifier, which in turn decides whether or not each term candidate is a term. The filter can also be used as a hub to merge different term extraction tools into a single higher-performing system. We compare filters that use the skip-gram architecture with filters that employ the CBOW architecture for this task.
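The filtering idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vectors are hand-written rather than trained, the local and global vectors are combined by simple concatenation, and a nearest-centroid rule stands in for whatever classifier the paper actually uses. The names `LOCAL`, `GLOBAL`, `local_global`, `train`, and `filter_candidates` are hypothetical.

```python
import math

# Hand-written toy vectors standing in for trained embeddings (hypothetical values).
# LOCAL would come from a domain corpus, GLOBAL from a general corpus.
LOCAL = {
    "tensor": [0.9, 0.1], "gradient": [0.8, 0.2],
    "people": [0.1, 0.9], "weather": [0.2, 0.8],
    "eigenvalue": [0.85, 0.15], "jacobian": [0.7, 0.3],
    "thing": [0.15, 0.85], "today": [0.3, 0.7],
}
GLOBAL = {
    "tensor": [0.7, 0.3], "gradient": [0.6, 0.4],
    "people": [0.2, 0.8], "weather": [0.1, 0.9],
    "eigenvalue": [0.8, 0.2], "jacobian": [0.75, 0.25],
    "thing": [0.3, 0.7], "today": [0.2, 0.8],
}


def local_global(word):
    """Assumed combination: concatenate the local and global vectors."""
    return LOCAL[word] + GLOBAL[word]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def train(labelled):
    """Build one centroid per class from a few labelled words (word -> is_term)."""
    terms = [local_global(w) for w, is_term in labelled.items() if is_term]
    non_terms = [local_global(w) for w, is_term in labelled.items() if not is_term]
    return centroid(terms), centroid(non_terms)


def filter_candidates(candidates, model):
    """Keep candidates whose vector is closer to the term centroid."""
    term_centroid, non_term_centroid = model
    kept = []
    for word in candidates:
        if word not in LOCAL or word not in GLOBAL:
            continue  # out-of-vocabulary candidates are skipped in this sketch
        vector = local_global(word)
        if cosine(vector, term_centroid) > cosine(vector, non_term_centroid):
            kept.append(word)
    return kept


# A small number of observed instances, labelled as term / non-term.
model = train({"tensor": True, "gradient": True, "people": False, "weather": False})

# Outputs of two hypothetical extraction tools, merged before filtering.
tool_a = ["eigenvalue", "thing"]
tool_b = ["jacobian", "eigenvalue", "today"]
merged = sorted(set(tool_a) | set(tool_b))

accepted = filter_candidates(merged, model)
print(accepted)  # ['eigenvalue', 'jacobian'] with these toy vectors
```

Taking the union of the candidate lists before filtering is one simple way to read the "hub" role described above: the filter sees every candidate proposed by any tool and makes a single accept or reject decision per candidate.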
Anthology ID: W16-4702
Volume: Proceedings of the 5th International Workshop on Computational Terminology (Computerm2016)
Month: December
Year: 2016
Address: Osaka, Japan
Venues: CompuTerm | WS
Publisher: The COLING 2016 Organizing Committee
Pages: 2–11
URL: https://www.aclweb.org/anthology/W16-4702
PDF: http://aclanthology.lst.uni-saarland.de/W16-4702.pdf