Chinese Chunk-Based Dependency Grammar and Treebank Construction (汉语块依存语法与树库构建)

Qingqing Qian (钱青青), Chengwen Wang (王诚文)


Abstract
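This study builds a chunk-based dependency treebank according to a predicate-centered chunk dependency grammar. It looks both within and across sentences for the chunks governed by each predicate, uses the dependency relations between chunks in Chinese to fill in omitted components, and makes the government relations of predicates explicit. So far, 2,199 texts have been annotated, covering two domains (encyclopedia and news) and totaling about 1.87 million characters. The paper briefly describes the principles of chunk-based dependency grammar and defines chunks and their dependency relations. It then describes in detail the annotation workflow, the annotation agreement rate, and the data distribution. Based on the existing treebank, the study finds that about 25% of Chinese clauses are not self-contained, and that about 88% of core predicates govern 1 to 3 dependents.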
Anthology ID:
2020.ccl-1.53
Volume:
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Month:
October
Year:
2020
Address:
Haikou, China
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
572–580
Language:
Chinese
URL:
https://www.aclweb.org/anthology/2020.ccl-1.53
PDF:
http://aclanthology.lst.uni-saarland.de/2020.ccl-1.53.pdf