Adapting predominant and novel sense discovery algorithms for identifying corpus-specific sense differences

Binny Mathew, Suman Kalyan Maity, Pratip Sarkar, Animesh Mukherjee, Pawan Goyal


Abstract
Word senses are not static and may have temporal, spatial or corpus-specific scopes. Identifying such scopes could substantially benefit existing word sense disambiguation (WSD) systems. In this paper, we study corpus-specific word senses and adapt three existing predominant and novel-sense discovery algorithms to identify them. We use text data from millions of digitized books and from newspaper archives as two different corpus sources, and propose automated methods to identify corpus-specific word senses at various time points. We conduct an extensive human judgement experiment to rigorously evaluate and compare the performance of these approaches. After adaptation, the outputs of the three algorithms are in the same format and their accuracy results are comparable, with roughly 45-60% of the reported corpus-specific senses judged as genuine.
Anthology ID:
W17-2402
Volume:
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing
Month:
August
Year:
2017
Address:
Vancouver, Canada
Venues:
TextGraphs | WS
Publisher:
Association for Computational Linguistics
Pages:
11–20
URL:
https://www.aclweb.org/anthology/W17-2402
DOI:
10.18653/v1/W17-2402
PDF:
http://aclanthology.lst.uni-saarland.de/W17-2402.pdf