Gerhard Heyer


2018

ILCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data
Andreas Niekler | Arnim Bleier | Christian Kahmann | Lisa Posch | Gregor Wiedemann | Kenan Erdogan | Gerhard Heyer | Markus Strohmaier
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Page Stream Segmentation with Convolutional Neural Nets Combining Textual and Visual Features
Gregor Wiedemann | Gerhard Heyer
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2014

A New Implementation for Canonical Text Services
Jochen Tiepmar | Christoph Teichmann | Gerhard Heyer | Monica Berti | Gregory Crane
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)

2010

SentiWS - A Publicly Available German-language Resource for Sentiment Analysis
Robert Remus | Uwe Quasthoff | Gerhard Heyer
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

SentimentWortschatz, or SentiWS for short, is a publicly available German-language resource for sentiment analysis, opinion mining, etc. It lists positive and negative sentiment-bearing words weighted within the interval [-1; 1], plus their part-of-speech tag and, if applicable, their inflections. The current version of SentiWS (v1.8b) contains 1,650 negative and 1,818 positive words, which sum up to 16,328 negative and 16,406 positive word forms, respectively. It contains not only adjectives and adverbs that explicitly express a sentiment, but also nouns and verbs that implicitly carry one. The present work describes the resource’s structure, the three sources utilised to assemble it, and the semi-supervised method used to weight the strength of its entries. Furthermore, the resource’s contents are extensively evaluated using a German-language evaluation set we constructed. The evaluation set is verified to be reliable, and it is shown that SentiWS provides a beneficial lexical resource on which German-language sentiment analysis tasks can build.
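
The entry structure described in the abstract above (a word, a part-of-speech tag, a weight in [-1, 1], and optional inflections) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `LexiconEntry` class, the `build_index` and `score` helpers, the `"ADJ"` tag, and the toy weights are hypothetical and are not taken from SentiWS or its distribution format. Summing weights is also just one simple way to use such a lexicon, not the method of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexiconEntry:
    """Hypothetical entry: base form, POS tag, weight in [-1, 1], inflections."""
    lemma: str
    pos: str
    weight: float
    inflections: tuple = ()


def build_index(entries):
    """Map every lowercased word form (lemma and inflections) to its weight."""
    index = {}
    for entry in entries:
        if not -1.0 <= entry.weight <= 1.0:
            raise ValueError(f"weight out of range for {entry.lemma}: {entry.weight}")
        for form in (entry.lemma, *entry.inflections):
            index[form.lower()] = entry.weight
    return index


def score(tokens, index):
    """Sum the weights of all tokens found in the index; unknown tokens add 0."""
    return sum(index.get(token.lower(), 0.0) for token in tokens)


# Toy entries with made-up weights, NOT actual SentiWS values.
TOY_ENTRIES = [
    LexiconEntry("gut", "ADJ", 0.5, ("gute", "guter", "gutes")),
    LexiconEntry("schlecht", "ADJ", -0.75, ("schlechte", "schlechter")),
]

index = build_index(TOY_ENTRIES)
total = score("Das Essen war gut aber der Service war schlechter".split(), index)
print(total)
```

Indexing inflections alongside the base form lets a plain token lookup match inflected text without a separate lemmatisation step, which is presumably why a resource of this kind lists word forms explicitly.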

2008

ASV Toolbox: a Modular Collection of Language Exploration Tools
Chris Biemann | Uwe Quasthoff | Gerhard Heyer | Florian Holz
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

ASV Toolbox is a modular collection of tools for the exploration of written language data, for both scientific and educational purposes. It includes modules that operate on word lists or texts and allow users to perform various linguistic annotation, classification and clustering tasks, including language detection, POS-tagging, base form reduction, named entity recognition, and terminology extraction. On a more abstract level, the algorithms deal with various kinds of word similarity, using pattern-based and statistical approaches. The collection can be used to work on large real-world data sets as well as for studying the underlying algorithms. Each module of the ASV Toolbox is designed to work either on plain text files or with a connection to a MySQL database. While it is especially designed to work with corpora of the Leipzig Corpora Collection, it can easily be adapted to other sources.

Tapping Huge Temporally Indexed Textual Resources with WCTAnalyze
Sebastian Gottwald | Matthias Richter | Gerhard Heyer | Gerik Scheuermann
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

WCTAnalyze is a tool for storing, accessing and visually analyzing huge collections of temporally indexed data. It is motivated by applications in media analysis, business intelligence, etc., where higher-level analysis is performed on top of linguistically and statistically processed unstructured textual data. WCTAnalyze combines fast access with economical storage behaviour and provides many built-in visualization options for presenting results in detail as well as in contrast. It thus enables an efficient and effective way to explore chronological text patterns of word forms, their co-occurrence sets and co-occurrence set intersections. Digging deep into the co-occurrences of word forms with the same semantic or syntactic description, some entities can be recognized as temporally related, whereas others differ significantly. This behaviour motivates approaches to the interactive discovery of events based on co-occurrence subsets.

2002

Information Extraction from Text Corpora: Using Filters on Collocation Sets
Gerhard Heyer | Uwe Quasthoff | Christian Wolff
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC'02)

1992

Extended Spelling Correction for German
Ralf Kese | Friedrich Dudda | Gerhard Heyer | Marianne Kugler
Third Conference on Applied Natural Language Processing

1990

Knowledge Representation and Semantics in a Complex Domain: The UNIX Natural Language Help System GOETHE
Gerhard Heyer | Ralf Kese | Frank Oemig | Friedrich Dudda
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics