Enrique Noriega-Atala


pdf bib
Understanding the Polarity of Events in the Biomedical Literature: Deep Learning vs. Linguistically-informed Methods
Enrique Noriega-Atala | Zhengzhong Liang | John Bachman | Clayton Morrison | Mihai Surdeanu
Proceedings of the Workshop on Extracting Structured Knowledge from Scientific Publications

An important task in the machine reading of biochemical events expressed in biomedical texts is correctly reading the polarity, i.e., attributing whether the biochemical event is a promotion or an inhibition. Here we present a novel dataset for studying polarity attribution accuracy. We use this dataset to train and evaluate several deep learning models for polarity identification, and compare these to a linguistically-informed model. The best performing deep learning architecture achieves 0.968 average F1 performance in a five-fold cross-validation study, a considerable improvement over the linguistically informed model average F1 of 0.862.


pdf bib
Learning what to read: Focused machine reading
Enrique Noriega-Atala | Marco A. Valenzuela-Escárcega | Clayton Morrison | Mihai Surdeanu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Recent efforts in bioinformatics have achieved tremendous progress in the machine reading of biomedical literature, and the assembly of the extracted biochemical interactions into large-scale models such as protein signaling pathways. However, batch machine reading of literature at today’s scale (PubMed alone indexes over 1 million papers per year) is unfeasible due to both cost and processing overhead. In this work, we introduce a focused reading approach to guide the machine reading of biomedical literature towards what literature should be read to answer a biomedical query as efficiently as possible. We introduce a family of algorithms for focused reading, including an intuitive, strong baseline, and a second approach which uses a reinforcement learning (RL) framework that learns when to explore (widen the search) or exploit (narrow it). We demonstrate that the RL approach is capable of answering more queries than the baseline, while being more efficient, i.e., reading fewer documents.