Linguistically Light Lexical Extensions for Ontologies

Brian Davis, Siegfried Handschuh, Alexander Troussov, John Judge, Mikhail Sogrin


Abstract
The identification of class instances within unstructured text for either the purposes of Ontology population or semantic annotation are usually limited to term mentions of Proper Noun and Personal Noun or fixed Key Phrases within Text Analytics or Ontology based Information Extraction(OBIE) applications. These systems do not generalize to cope with compound nominal classes of multi word expressions. Computational Linguistics’ approaches involving deep analysis tend to suffer from idiomaticity and overgeneration problems while the shallower “words with spaces” approach frequently employed in Information Extraction(IE) and Industrial Text Analytics systems lacks flexibility and is prone to lexical proliferation. We outline a representation for encoding light linguistic features of Compound Nominal term mentions of Concepts within an Ontology as well as a lightweight semantic annotator which complies the above linguistic information into efficient Dictionary formats to drive large scale identification and semantic annotation of the aforementioned concepts.
Anthology ID:
L08-1558
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/911_paper.pdf
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/911_paper.pdf