BUCEADOR, a multi-language search engine for digital libraries

Jordi Adell, Antonio Bonafonte, Antonio Cardenal, Marta R. Costa-Jussà, José A. R. Fonollosa, Asunción Moreno, Eva Navas, Eduardo R. Banga


Abstract
This paper presents a web-based multimedia search engine built within the Buceador (www.buceador.org) research project. A proof-of-concept tool has been implemented which is able to retrieve information from a digital library made of multimedia documents in the 4 official languages in Spain (Spanish, Basque, Catalan and Galician). The retrieved documents are presented in the user language after translation and dubbing (the four previous languages + English). The paper presents the tool functionality, the architecture, the digital library and provide some information about the technology involved in the fields of automatic speech recognition, statistical machine translation, text-to-speech synthesis and information retrieval. Each technology has been adapted to the purposes of the presented tool as well as to interact with the rest of the technologies involved.
Anthology ID:
L12-1493
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
1705–1709
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/828_Paper.pdf
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/828_Paper.pdf