Challenges for Making Use of a Large Text Corpus such as the ‘AACAustrian Academy Corpus’ for Digital Literary Studies

Hanno Biber


Abstract
The challenges for making use of a large text corpus such as the ‘AAC – Austrian Academy Corpus’ for the purposes of digital literary studies will be addressed in this presentation. The research question of how to use a digital text corpus of considerable size for such a specific research purpose is of interest for corpus research in general as it is of interest for digital literary text studies which rely to a large extent on large digital text corpora. The observations of the usage of lexical entities such as words, word forms, multi word units and many other linguistic units determine the way in which texts are being studied and explored. Larger entities have to be taken into account as well, which is why questions of semantic analysis and larger structures come into play. The texts of the AAC – Austrian Academy Corpus which was founded in 2001 are German language texts of historical and cultural significance from the time between 1848 and 1989. The aim of this study is to present possible research questions for corpus-based methodological approaches for the digital study of literary texts and to give examples of early experiments and experiences with making use of a large text corpus for these research purposes.
Anthology ID:
2020.cmlc-1.7
Volume:
Proceedings of the 8th Workshop on Challenges in the Management of Large Corpora
Month:
May
Year:
2020
Address:
Marseille, France
Venues:
CMLC | LREC | WS
SIG:
Publisher:
European Language Ressources Association
Note:
Pages:
47–51
Language:
English
URL:
https://www.aclweb.org/anthology/2020.cmlc-1.7
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.cmlc-1.7.pdf