Using Linguistic Resources to Evaluate the Quality of Annotated Corpora
Abstract
Statistical and neural-network-based methods that compute their results by comparing a given text to be analyzed with a reference corpus assume that the reference corpus is complete and reliable enough. In this article, I conduct several experiments on an extract of the Open American National Corpus to verify this assumption.
- Anthology ID: W18-3802
- Volume: Proceedings of the First Workshop on Linguistic Resources for Natural Language Processing
- Month: August
- Year: 2018
- Address: Santa Fe, New Mexico, USA
- Venues: COLING | LR4NLP | WS
- Publisher: Association for Computational Linguistics
- Pages: 2–11
- URL: https://www.aclweb.org/anthology/W18-3802
- PDF: http://aclanthology.lst.uni-saarland.de/W18-3802.pdf