On maximizing metrics for syntactic disambiguation

Khalil Sima’an


Abstract
Given a probabilistic parsing model and an evaluation metric for scoring the match between parse-trees, e.g., PARSEVAL [Black et al., 1991], this paper addresses the problem of how to select, for an input sentence, the parse-tree that scores best on average. Common wisdom dictates that it is optimal to select the parse with the highest probability, regardless of the evaluation metric. In contrast, the Maximizing Metrics (MM) method [Goodman, 1998; Stolcke et al., 1997] proposes that an algorithm that optimizes the evaluation metric itself constitutes the optimal choice. We study the MM method within parsing. We observe that the MM proposition does not always hold for tree-bank models, and that optimizing weak metrics is not interesting for semantic processing. Subsequently, we state an alternative proposition: the optimal algorithm must maximize the metric that scores parse-trees according to linguistically relevant features. We present new algorithms that optimize metrics that take into account increasingly many linguistic features, and report experiments in support of our claim.
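The contrast drawn in the abstract, between selecting the most probable parse and selecting the parse with the best expected metric score, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a short explicit list of candidate parses with model probabilities, represents each parse as a set of labeled spans, and uses labeled recall as a stand-in for a PARSEVAL-style metric. All names (`labeled_recall`, `most_probable`, `max_expected_metric`, the toy parses) are hypothetical and the probabilities are invented for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# A constituent is (label, start, end); a parse is a set of constituents.
# This flat representation is an assumption made for illustration only.
Constituent = Tuple[str, int, int]
Parse = FrozenSet[Constituent]


def labeled_recall(candidate: Parse, reference: Parse) -> float:
    """Fraction of the reference constituents that the candidate recovers."""
    if not reference:
        return 1.0
    return len(candidate & reference) / len(reference)


def most_probable(dist: Dict[Parse, float]) -> Parse:
    """Select the parse with the highest model probability."""
    return max(dist, key=lambda parse: dist[parse])


def max_expected_metric(dist: Dict[Parse, float]) -> Parse:
    """Select the parse with the highest expected metric score, treating each
    candidate in turn as the possible correct parse, weighted by its probability."""
    def expected_score(candidate: Parse) -> float:
        return sum(p * labeled_recall(candidate, ref) for ref, p in dist.items())
    return max(dist, key=expected_score)


# Toy distribution over three parses of a four-word sentence (invented numbers).
parse_a: Parse = frozenset({("S", 0, 4), ("NP", 0, 2), ("VP", 2, 4)})
parse_b: Parse = frozenset({("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)})
parse_c: Parse = frozenset({("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("PP", 2, 4)})
dist: Dict[Parse, float] = {parse_a: 0.40, parse_b: 0.35, parse_c: 0.25}

by_probability = most_probable(dist)
by_expected_metric = max_expected_metric(dist)
print("same choice:", by_probability == by_expected_metric)
```

In this toy case the two criteria disagree: `parse_a` has the highest probability, but `parse_b` shares most of its constituents with `parse_c`, so it collects more expected recall across the distribution. Whether such a choice is actually preferable depends on the metric, which is the question the paper examines.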
Anthology ID: W03-3021
Volume: Proceedings of the Eighth International Conference on Parsing Technologies
Month: April
Year: 2003
Address: Nancy, France
Venues: IWPT | WS
SIG: SIGPARSE
Pages: 183–194
URL: https://www.aclweb.org/anthology/W03-3021
PDF: http://aclanthology.lst.uni-saarland.de/W03-3021.pdf