Predicting Degrees of Technicality in Automatic Terminology Extraction

Anna Hätty, Dominik Schlechtweg, Michael Dorna, Sabine Schulte im Walde


Abstract
While automatic term extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied. We semi-automatically create a German gold standard of technicality across four domains, and illustrate the impact of a web-crawled general-language corpus on technicality prediction. When defining a classification approach that combines general-language and domain-specific word embeddings, we go beyond previous work and align vector spaces to gain comparative embeddings. We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. Both models outperform previous approaches, with the multi-channel model performing best.
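The abstract does not say how the vector spaces are aligned. As a minimal sketch, assuming an orthogonal Procrustes rotation over a shared vocabulary (a common choice, not confirmed here as the authors' method), the snippet below maps a domain-specific space onto a general-language space and compares each word's two vectors by cosine similarity. All names (`align_orthogonal`, `cosine_rows`) and the random toy matrices are hypothetical and for illustration only.

```python
import numpy as np


def align_orthogonal(domain, general):
    """Rotate `domain` onto `general`; rows are the same words in the same order."""
    # The SVD of the cross-covariance yields the orthogonal map W that
    # minimises the Frobenius norm of (domain @ W - general).
    u, _, vt = np.linalg.svd(domain.T @ general)
    return domain @ (u @ vt)


def cosine_rows(a, b):
    """Row-wise cosine similarity between two matrices of equal shape."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / den


# Toy data (assumption): 5 shared words, 4 dimensions.
rng = np.random.default_rng(0)
general = rng.normal(size=(5, 4))

# Simulate a domain space as a rotated copy of the general space ...
q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
domain = general @ q
# ... in which one word is used differently than in general language.
domain[0] = rng.normal(size=4)

aligned = align_orthogonal(domain, general)
similarity = cosine_rows(aligned, general)  # one score per shared word
```

Under these assumptions, a per-word score such as `similarity` is one example of pre-computed comparative information that a classifier could take as input alongside the embeddings themselves.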
Anthology ID:
2020.acl-main.258
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2883–2889
URL:
https://www.aclweb.org/anthology/2020.acl-main.258
DOI:
10.18653/v1/2020.acl-main.258
PDF:
http://aclanthology.lst.uni-saarland.de/2020.acl-main.258.pdf
Video:
http://slideslive.com/38928698