Learning Subjectivity Phrases missing from Resources through a Large Set of Semantic Tests

Matthieu Vernier, Laura Monceaux, Béatrice Daille


Abstract
In recent years, blogs and social networks have particularly boosted interests for opinion mining research. In order to satisfy real-scale applicative needs, a main task is to create or to enhance lexical and semantic resources on evaluative language. Classical resources of the area are mostly built for english, they contain simple opinion word markers and are far to cover the lexical richness of this linguistic phenomenon. In particular, infrequent subjective words, idiomatic expressions, and cultural stereotypes are missing from resources. We propose a new method, applied on french, to enhance automatically an opinion word lexicon. This learning method relies on linguistic uses of internet users and on semantic tests to infer the degree of subjectivity of many new adjectives, nouns, verbs, noun phrases, verbal phrases which are usually forgotten by other resources. The final appraisal lexicon contains 3,456 entries. We evaluate the lexicon enhancement with and without textual context.
Anthology ID:
L10-1141
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/208_Paper.pdf
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/208_Paper.pdf