Arantza Casillas

Also published as: A. Casillas


2016

The impact of simple feature engineering in multilingual medical NER
Rebecka Weegar | Arantza Casillas | Arantza Diaz de Ilarraza | Maite Oronoz | Alicia Pérez | Koldo Gojenola
Proceedings of the Clinical Natural Language Processing Workshop (ClinicalNLP)

The goal of this paper is to examine the impact of simple feature engineering mechanisms before applying more sophisticated techniques to the task of medical NER. Papers that use scientifically sound techniques sometimes present raw baselines that could be improved by adding simple and cheap features. This work focuses on entity recognition in the clinical domain for three languages: English, Swedish and Spanish. The task is tackled using simple features, starting from the window size, capitalization and prefixes, and moving on to POS and semantic tags. This work demonstrates that a simple initial step of feature engineering can improve the baseline results significantly. Hence, the contributions of this paper are: first, a short list of guidelines well supported by experimental results on three languages and, second, a detailed description of the relevance of these features for medical NER.
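As an illustration of the kind of features the abstract names (context window, capitalization, prefixes), the following is a minimal sketch and not the authors' implementation. The function name `token_features`, its parameters, the feature names and the `<PAD>` marker are all hypothetical choices made for this example.

```python
def token_features(tokens, i, window=2, prefix_len=3):
    """Build a feature dict for tokens[i] (hypothetical helper).

    Covers three simple feature families: surface form and
    capitalization, a fixed-length prefix, and neighbouring words
    inside a symmetric context window.
    """
    tok = tokens[i]
    feats = {
        "word": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "prefix": tok[:prefix_len].lower(),
    }
    # Neighbouring words; positions outside the sentence get a pad marker.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        inside = 0 <= j < len(tokens)
        feats[f"word[{off:+d}]"] = tokens[j].lower() if inside else "<PAD>"
    return feats


sentence = ["Patient", "denies", "chest", "pain", "."]
features = token_features(sentence, 2, window=1)
```

Feature dicts of this shape can be fed to a sequence tagger; POS and semantic tags, which the abstract also mentions, would be added as further keys once a tagger for the target language is available.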

Fully unsupervised low-dimensional representation of adverse drug reaction events through distributional semantics
Alicia Pérez | Arantza Casillas | Koldo Gojenola
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)

Electronic health records show great variability, since the same concept is often expressed with different terms: scientific Latin forms, common or lay variants, and even vernacular names. Deep learning enables distributional representation of terms in a vector space, and therefore related terms tend to be close in that space. Accordingly, embedding words through these vectors opens the way towards accounting for semantic relatedness through classical algebraic operations. In this work we propose a simple yet efficient unsupervised characterization of Adverse Drug Reactions (ADRs). This approach exploits the embedding representation of the terms involved in candidate ADR events, that is, drug-disease entity pairs. In brief, the ADRs are represented as vectors that link the drug with the disease in their context through a recursive additive model. We found that a low-dimensional representation that makes use of the modulus and argument of the embedded representation of the ADR event correlates with the manually annotated class. This suggests that the characterization can provide useful predictive features for subsequent classification tasks.
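The additive representation and its reduction to a modulus and an argument can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's method: the abstract does not specify the recursion or how the argument is defined, so here the event vector is a plain sum of the embeddings of the tokens between drug and disease, and the argument is taken as the angle to a reference vector. All function names and the toy embeddings are hypothetical.

```python
import math


def event_vector(tokens, emb):
    """Sum the embeddings of the known tokens (assumed additive model)."""
    dim = len(next(iter(emb.values())))
    total = [0.0] * dim
    for tok in tokens:
        if tok in emb:  # tokens without an embedding are skipped
            total = [a + b for a, b in zip(total, emb[tok])]
    return total


def modulus(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def angle(u, v):
    """Angle in radians between u and v; 0.0 if either is a zero vector."""
    nu, nv = modulus(u), modulus(v)
    if nu == 0 or nv == 0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp rounding error


# Toy 2-dimensional embeddings, invented for the example.
toy_emb = {"drug": [1.0, 0.0], "causes": [0.5, 0.5], "rash": [0.0, 1.0]}
vec = event_vector(["drug", "causes", "rash"], toy_emb)
low_dim = (modulus(vec), angle(vec, toy_emb["drug"]))
```

The resulting pair `low_dim` is the two-value description of one candidate event; in a real setting the embeddings would come from a model trained on clinical text.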

2014

IxaMed: Applying Freeling and a Perceptron Sequential Tagger at the Shared Task on Analyzing Clinical Texts
Koldo Gojenola | Maite Oronoz | Alicia Pérez | Arantza Casillas
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Adverse Drug Event prediction combining shallow analysis and machine learning
Sara Santiso | Arantza Casillas | Alicia Pérez | Maite Oronoz | Koldo Gojenola
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)

2012

First Approaches on Spanish Medical Record Classification Using Diagnostic Term to Class Transduction
A. Casillas | A. Díaz de Ilarraza | K. Gojenola | M. Oronoz | Alicia Pérez
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing

2011

Using Kybots for Extracting Events in Biomedical Texts
Arantza Casillas | Arantza Díaz de Ilarraza | Koldo Gojenola | Maite Oronoz | German Rigau
Proceedings of BioNLP Shared Task 2011 Workshop

Testing the Effect of Morphological Disambiguation in Dependency Parsing of Basque
Kepa Bengoetxea | Arantza Casillas | Koldo Gojenola
Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages

2006

Multilingual Document Clustering: An Heuristic Approach Based on Cognate Named Entities
Soto Montalvo | Raquel Martínez | Arantza Casillas | Víctor Fresno
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2000

DTD-driven bilingual document generation
Arantza Casillas | Joseba Abaitua | Raquel Martínez
INLG’2000 Proceedings of the First International Conference on Natural Language Generation

1998

Bitext Correspondences through Rich Mark-up
Raquel Martinez | Joseba Abaitua | Arantza Casillas
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

Bitext Correspondences through Rich Mark-up
Raquel Martinez | Joseba Abaitua | Arantza Casillas
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

Aligning tagged bitexts
Raquel Martinez | Joseba Abaitua | Arantza Casillas
Sixth Workshop on Very Large Corpora