Alicia Pérez


2016

The impact of simple feature engineering in multilingual medical NER
Rebecka Weegar | Arantza Casillas | Arantza Diaz de Ilarraza | Maite Oronoz | Alicia Pérez | Koldo Gojenola
Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)

The goal of this paper is to examine the impact of simple feature engineering mechanisms before applying more sophisticated techniques to the task of medical NER. Papers that use scientifically sound techniques sometimes report raw baselines that could be improved by adding simple, cheap features. This work focuses on entity recognition in the clinical domain for three languages: English, Swedish and Spanish. The task is tackled using simple features, starting from window size, capitalization and prefixes, and moving on to POS and semantic tags. This work demonstrates that a simple initial step of feature engineering can improve the baseline results significantly. Hence, the contributions of this paper are: first, a short list of guidelines well supported by experimental results on three languages and, second, a detailed description of the relevance of these features for medical NER.
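
As an illustration only, the surface features named in the abstract (context window, capitalization, prefixes) could be extracted per token roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `token_features`, the `<PAD>` marker, and the default window and prefix sizes are all assumptions, and POS and semantic tags are left out because they require external taggers.

```python
def token_features(tokens, i, window=2, prefix_len=3):
    """Build a feature dict for tokens[i] (hypothetical helper, not from the paper)."""
    tok = tokens[i]
    feats = {
        "word": tok.lower(),
        "is_capitalized": tok[:1].isupper(),  # first character is upper case
        "is_all_caps": tok.isupper(),         # e.g. abbreviations such as "ECG"
        "prefix": tok[:prefix_len].lower(),   # leading characters of the token
    }
    # Context window: lower-cased neighbours on each side, padded at sentence edges.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"word[{off:+d}]"] = tokens[j].lower() if inside else "<PAD>"
    return feats


if __name__ == "__main__":
    sentence = ["Patient", "denies", "chest", "pain", "."]
    print(token_features(sentence, 0, window=1))
```

Dicts of this shape can be fed to most sequence labellers that accept sparse features; the window size and prefix length are exactly the kind of cheap settings the abstract suggests tuning before moving to heavier models.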

Fully unsupervised low-dimensional representation of adverse drug reaction events through distributional semantics
Alicia Pérez | Arantza Casillas | Koldo Gojenola
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)

Electronic health records show great variability, since the same concept is often expressed with different terms: scientific Latin forms, common or lay variants, and even vernacular names. Deep learning enables a distributional representation of terms in a vector space, in which related terms tend to lie close together. Accordingly, embedding words as such vectors opens the way to accounting for semantic relatedness through classical algebraic operations. In this work we propose a simple yet efficient unsupervised characterization of Adverse Drug Reactions (ADRs). This approach exploits the embedding representation of the terms involved in candidate ADR events, that is, drug-disease entity pairs. In brief, the ADRs are represented as vectors that link the drug with the disease in their context through a recursive additive model. We found that a low-dimensional representation based on the modulus and argument of the embedded representation of the ADR event correlates with the manually annotated class. This suggests that the characterization can serve as a source of predictive features for further classification tasks.
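
The idea of reducing an ADR event to a modulus and an argument can be sketched as below. This is a rough illustration under stated assumptions, not the paper's method: a plain vector sum stands in for the recursive additive model, the "argument" is assumed here to be the angle between the event vector and the drug vector, and the function names and toy two-dimensional embeddings are hypothetical.

```python
import math


def add_vectors(vectors):
    """Component-wise sum; a plain sum stands in for the additive model (assumption)."""
    return [sum(components) for components in zip(*vectors)]


def modulus(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def angle_between(u, v):
    """Angle in radians between two vectors; 0.0 if either vector is zero."""
    denom = modulus(u) * modulus(v)
    if denom == 0:
        return 0.0
    cosine = sum(a * b for a, b in zip(u, v)) / denom
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding error


def event_features(drug_vec, context_vecs, disease_vec):
    """Two-dimensional description (modulus, argument) of a candidate ADR event."""
    event = add_vectors([drug_vec, *context_vecs, disease_vec])
    # Assumption: the argument is measured against the drug vector.
    return modulus(event), angle_between(event, drug_vec)


if __name__ == "__main__":
    # Toy embeddings, for illustration only.
    print(event_features([1.0, 0.0], [], [0.0, 1.0]))
```

The resulting pair of numbers per drug-disease candidate could then be passed to a downstream classifier as features, which is the use the abstract points to.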

2014

IxaMed: Applying Freeling and a Perceptron Sequential Tagger at the Shared Task on Analyzing Clinical Texts
Koldo Gojenola | Maite Oronoz | Alicia Pérez | Arantza Casillas
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Adverse Drug Event prediction combining shallow analysis and machine learning
Sara Santiso | Arantza Casillas | Alicia Pérez | Maite Oronoz | Koldo Gojenola
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)

2012

First Approaches on Spanish Medical Record Classification Using Diagnostic Term to Class Transduction
A. Casillas | A. Díaz de Ilarraza | K. Gojenola | M. Oronoz | Alicia Pérez
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing

Finite-State Acoustic and Translation Model Composition in Statistical Speech Translation: Empirical Assessment
Alicia Pérez | M. Inés Torres | Francisco Casacuberta
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing

2010

Potential scope of a fully-integrated architecture for speech translation
Alicia Pérez | María Inés Torres | Francisco Casacuberta
Proceedings of the 14th Annual conference of the European Association for Machine Translation

2007

An Integrated Architecture for Speech-Input Multi-Target Machine Translation
Alicia Pérez | M. Teresa González | M. Inés Torres | Francisco Casacuberta
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

Speech-Input Multi-Target Machine Translation
Alicia Pérez | M. Teresa González | M. Inés Torres | Francisco Casacuberta
Proceedings of the Second Workshop on Statistical Machine Translation