Carmen Garcia-Mateo

Also published as: Carmen García-Mateo


2020

LSE_UVIGO: A Multi-source Database for Spanish Sign Language Recognition
Laura Docío-Fernández | José Luis Alba-Castro | Soledad Torres-Guijarro | Eduardo Rodríguez-Banga | Manuel Rey-Area | Ania Pérez-Pérez | Sonia Rico-Alonso | Carmen García-Mateo
Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives

This paper presents LSE_UVIGO, a multi-source database designed to foster research on Sign Language Recognition. It is being recorded and compiled for Spanish Sign Language (LSE, its acronym in Spanish) and also contains spoken Galician, so it is well suited to research on these languages and also useful for fundamental research on any other sign language. LSE_UVIGO is composed of two datasets. The first, LSE_Lex40_UVIGO, is a multi-sensor, multi-signer dataset acquired from scratch and designed to grow incrementally, both in the complexity of the visual content and in the variety of signers. It contains static and co-articulated sign recordings, fingerspelled and gloss-based isolated words, and sentences. It is acquired in a controlled lab environment in order to obtain good-quality videos with sharp frames and both RGB and depth information, making them suitable for trying different approaches to automatic recognition. The second dataset, LSE_TVGWeather_UVIGO, is being populated from regional television weather forecasts interpreted into LSE, as a faster way to acquire high-quality, continuous LSE recordings with a domain-restricted vocabulary and a correspondence to spoken sentences.

2014

Introducing a Framework for the Evaluation of Music Detection Tools
Paula Lopez-Otero | Laura Docio-Fernandez | Carmen Garcia-Mateo
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The huge amount of multimedia information available nowadays makes manual processing prohibitive, so tools for automatically labelling this content are required. This paper describes a framework for assessing a music detection tool. The framework consists of a database, composed of several hours of radio recordings that include different types of radio programmes, and a set of measures for evaluating the performance of a music detection tool in detail. A tool for automatically detecting music in audio streams, with application to music information retrieval tasks, is presented as well. The aim of this tool is to discard the audio excerpts that do not contain music in order to avoid processing them unnecessarily. The tool applies fingerprinting to different acoustic features extracted from the audio signal in order to remove perceptual irrelevancies, and a support vector machine is trained to classify these fingerprints into the classes music and no-music. The validity of this tool is assessed in the proposed evaluation framework.

The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm | Hans Uszkoreit | Sophia Ananiadou | Núria Bel | Audronė Bielevičienė | Lars Borin | António Branco | Gerhard Budin | Nicoletta Calzolari | Walter Daelemans | Radovan Garabík | Marko Grobelnik | Carmen García-Mateo | Josef van Genabith | Jan Hajič | Inma Hernáez | John Judge | Svetla Koeva | Simon Krek | Cvetana Krstev | Krister Lindén | Bernardo Magnini | Joseph Mariani | John McNaught | Maite Melero | Monica Monachini | Asunción Moreno | Jan Odijk | Maciej Ogrodniczuk | Piotr Pęzik | Stelios Piperidis | Adam Przepiórkowski | Eiríkur Rögnvaldsson | Michael Rosner | Bolette Pedersen | Inguna Skadiņa | Koenraad De Smedt | Marko Tadić | Paul Thompson | Dan Tufiş | Tamás Váradi | Andrejs Vasiļjevs | Kadri Vider | Jolanta Zabarskaite
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014. We describe its impact at the regional, national and international levels, mainly with regard to politics and the funding situation for LT topics. The paper documents the initiative’s work throughout Europe to boost progress and innovation in our field.

CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis
Carmen García-Mateo | Antonio Cardenal | Xosé Luis Regueira | Elisa Fernández Rei | Marta Martinez | Roberto Seara | Rocío Varela | Noemí Basanta
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes CORILGA (“Corpus Oral Informatizado da Lingua Galega”). CORILGA is a large, high-quality corpus of spoken Galician from the 1960s to the present day, including both formal and informal spoken language from both standard and non-standard varieties, and across different generations and social levels. The corpus will be available to the research community upon completion. Galician is one of the EU languages that need further research before highly effective language technology solutions can be implemented. A software repository for speech resources in Galician is also described. The repository includes a structured database, a graphical interface and processing tools. The database enables simple and fast searches based on a number of different criteria. The web-based user interface gives users easy access to the different materials. Last but not least, a set of transcription-based modules for automatic speech recognition has been developed, thus facilitating the orthographic labelling of the recordings.

2010

Building High Quality Databases for Minority Languages such as Galician
Francisco Campillo | Daniela Braga | Ana Belén Mourín | Carmen García-Mateo | Pedro Silva | Miguel Sales Dias | Francisco Méndez
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes the result of a joint R&D project between Microsoft Portugal and the Signal Theory Group of the University of Vigo (Spain), in which a set of language resources was developed with application to Text-to-Speech synthesis. First, a large corpus of 10000 Galician sentences was designed and recorded by a professional female speaker. Second, a lexicon of over 90000 entries with phonetic and grammatical information was collected and reviewed manually by an expert linguist. Finally, these resources were used for a MOS (Mean Opinion Score) perceptual test to compare the two groups' state-of-the-art speech synthesizers: the one from Microsoft, based on HMM, and the one from the University of Vigo, based on unit selection.

2004

The COST278 Pan-European Broadcast News Database
An Vandecatseye | Jean-Pierre Martens | Joao Neto | Hugo Meinedo | Carmen Garcia-Mateo | Javier Dieguez | France Mihelic | Janez Zibert | Jan Nouza | Petr David | Matus Pleva | Anton Cizmar | Harris Papageorgiou | Christina Alexandris
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Transcrigal: A Bilingual System for Automatic Indexing of Broadcast News
Carmen Garcia-Mateo | Javier Dieguez-Tirado | Laura Docio-Fernandez | Antonio Cardenal-Lopez
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2002

Acoustic Modeling and Training of a Bilingual ASR System when a Minority Language is Involved
Laura Docío-Fernández | Carmen García-Mateo
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)