In recent years, referring expression generation algorithms have been inspired by game theory and probability theory. This paper presents an algorithm for the generation of referring expressions (REG) that is based on both models and integrates the maximization of utilities into the content determination process. It implements cognitive models for assessing visual salience of objects and additional features. To evaluate the algorithm properly and to validate the applicability of existing models and evaluative information criteria, both production and comprehension studies are conducted using a complex domain of objects, providing new directions for approaching the evaluation of REG algorithms.
In recent years, Bayesian models of referring expression generation have gained prominence as a way to produce situationally more adequate referring expressions. These models allow different parameters to be integrated into the decision to use a specific referring expression, for example the cardinality of the object set, the configuration and complexity of the visual field, and the discriminatory power of the available attributes, which need to be combined with visual salience and personal preference. This paper describes and discusses the results of an empirical study on the production of referring expressions in visual fields with object configurations of varying complexity and with different contextual premises for using a referring expression. The visual fields are set up using data from the TUNA experiment, with either plain random or pragmatically enriched configurations that allow for target inference. Different categories of situational context in which the referring expressions are produced provide different degrees of cooperativeness, so that generation quality and its relation to contextual user intention can be observed. The results of the study suggest that Bayesian approaches must integrate individual generation preference and the cooperativeness of the situational task in order to model the broad variance between speakers more adequately.