GikiCLEF: Crosscultural Issues in Multilingual Information Access

Diana Santos, Luís Miguel Cabral, Corina Forascu, Pamela Forner, Fredric Gey, Katrin Lamm, Thomas Mandl, Petya Osenova, Anselmo Peñas, Álvaro Rodrigo, Julia Schulz, Yvonne Skalban, Erik Tjong Kim Sang


Abstract
In this paper we describe GikiCLEF, the first evaluation contest that, to our knowledge, was specifically designed to expose and investigate cultural and linguistic issues involved in structured multimedia collections and searching, and which was organized under the scope of CLEF 2009. GikiCLEF evaluated systems that answered hard questions for both human and machine, in ten different Wikipedia collections, namely Bulgarian, Dutch, English, German, Italian, Norwegian (Bokmäl and Nynorsk), Portuguese, Romanian, and Spanish. After a short historical introduction, we present the task, together with its motivation, and discuss how the topics were chosen. Then we provide another description from the point of view of the participants. Before disclosing their results, we introduce the SIGA management system explaining the several tasks which were carried out behind the scenes. We quantify in turn the GIRA resource, offered to the community for training and further evaluating systems with the help of the 50 topics gathered and the solutions identified. We end the paper with a critical discussion of what was learned, advancing possible ways to reuse the data.
Anthology ID:
L10-1186
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/272_Paper.pdf
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/272_Paper.pdf