Martha Larson


pdf bib
The Connection between the Text and Images of News Articles: New Insights for Multimedia Analysis
Nelleke Oostdijk | Hans van Halteren | Erkan Bașar | Martha Larson
Proceedings of the 12th Language Resources and Evaluation Conference

We report on a case study of text and images that reveals the inadequacy of simplistic assumptions about their connection and interplay. The context of our work is a larger effort to create automatic systems that can extract event information from online news articles about flooding disasters. We carry out a manual analysis of 1000 articles containing a keyword related to flooding. The analysis reveals that the articles in our data set cluster into seven categories related to different topical aspects of flooding, and that the images accompanying the articles cluster into five categories related to the content they depict. The results demonstrate that flood-related news articles do not consistently report on a single, currently unfolding flooding event and we should also not assume that a flood-related image will directly relate to a flooding-event described in the corresponding article. In particular, spatiotemporal distance is important. We validate the manual analysis with an automatic classifier demonstrating the technical feasibility of multimedia analysis approaches that admit more realistic relationships between text and images. In sum, our case study confirms that closer attention to the connection between text and images has the potential to improve the collection of multimodal information from news articles.

pdf bib
Truth or Error? Towards systematic analysis of factual errors in abstractive summaries
Klaus-Michael Lux | Maya Sappelli | Martha Larson
Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems

This paper presents a typology of errors produced by automatic summarization systems. The typology was created by manually analyzing the output of four recent neural summarization systems. Our work is motivated by the growing awareness of the need for better summary evaluation methods that go beyond conventional overlap-based metrics. Our typology is structured into two dimensions. First, the Mapping Dimension describes surface-level errors and provides insight into word-sequence transformation issues. Second, the Meaning Dimension describes issues related to interpretation and provides insight into breakdowns in truth, i.e., factual faithfulness to the original text. Comparative analysis revealed that two neural summarization systems leveraging pre-trained models have an advantage in decreasing grammaticality errors, but not necessarily factual errors. We also discuss the importance of ensuring that summary length and abstractiveness do not interfere with evaluating summary quality.


pdf bib
Creating a Data Collection for Evaluating Rich Speech Retrieval
Maria Eskevich | Gareth J.F. Jones | Martha Larson | Roeland Ordelman
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe the development of a test collection for the investigation of speech retrieval beyond identification of relevant content. This collection focuses on satisfying user information needs for queries associated with specific types of speech acts. The collection is based on an archive of the Internet video from Internet video sharing platform (, and was provided by the MediaEval benchmarking initiative. A crowdsourcing approach was used to identify segments in the video data which contain speech acts, to create a description of the video containing the act and to generate search queries designed to refind this speech act. We describe and reflect on our experiences with crowdsourcing this test collection using the Amazon Mechanical Turk platform. We highlight the challenges of constructing this dataset, including the selection of the data source, design of the crowdsouring task and the specification of queries and relevant items.


pdf bib
Creation of an Annotated German Broadcast Speech Database for Spoken Document Retrieval
Stefan Eickeler | Martha Larson | Wolff Rüter | Joachim Köhler
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)