Dina Demner-Fushman


2020

Towards Zero-Shot Conditional Summarization with Adaptive Multi-Task Fine-Tuning
Travis Goodwin | Max Savery | Dina Demner-Fushman
Findings of the Association for Computational Linguistics: EMNLP 2020

Automatic summarization research has traditionally focused on providing high quality general-purpose summaries of documents. However, there are many applications which require more specific summaries, such as supporting question answering or topic-based literature discovery. In this paper we study the problem of conditional summarization in which content selection and surface realization are explicitly conditioned on an ad-hoc natural language question or topic description. Because of the difficulty in obtaining sufficient reference summaries to support arbitrary conditional summarization, we explore the use of multi-task fine-tuning (MTFT) on twenty-one natural language tasks to enable zero-shot conditional summarization on five tasks. We present four new summarization datasets, two novel “online” or adaptive task-mixing strategies, and report zero-shot performance using T5 and BART, demonstrating that MTFT can improve zero-shot summarization quality.

Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing

Enhancing Question Answering by Injecting Ontological Knowledge through Regularization
Travis Goodwin | Dina Demner-Fushman
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures

Deep neural networks have demonstrated high performance on many natural language processing (NLP) tasks that can be answered directly from text, but have struggled to solve NLP tasks requiring external (e.g., world) knowledge. In this paper, we present OSCR (Ontology-based Semantic Composition Regularization), a method for injecting task-agnostic knowledge from an ontology or knowledge graph into a neural network during pre-training. We evaluated the performance of BERT pre-trained on Wikipedia with and without OSCR by measuring the performance when fine-tuning on two question answering tasks involving world knowledge and causal reasoning and one requiring domain (healthcare) knowledge, and obtained 33.3%, 18.6%, and 4% improved accuracy compared to pre-training BERT without OSCR.

Flight of the PEGASUS? Comparing Transformers on Few-shot and Zero-shot Multi-document Abstractive Summarization
Travis Goodwin | Max Savery | Dina Demner-Fushman
Proceedings of the 28th International Conference on Computational Linguistics

Recent work has shown that pre-trained Transformers obtain remarkable performance on many natural language processing tasks including automatic summarization. However, most work has focused on (relatively) data-rich single-document summarization settings. In this paper, we explore highly-abstractive multi-document summarization where the summary is explicitly conditioned on a user-given topic statement or question. We compare the summarization quality produced by three state-of-the-art transformer-based models: BART, T5, and PEGASUS. We report the performance on four challenging summarization datasets: three from the general domain and one from consumer health in both zero-shot and few-shot learning settings. While prior work has shown significant differences in performance for these models on standard summarization tasks, our results indicate that with as few as 10 labeled examples there is no statistically significant difference in summary quality, suggesting the need for more abstractive benchmark collections when determining state-of-the-art.

HOLMS: Alternative Summary Evaluation with Large Language Models
Yassine Mrabet | Dina Demner-Fushman
Proceedings of the 28th International Conference on Computational Linguistics

Efficient document summarization requires evaluation measures that can not only rank a set of systems based on an average score, but also highlight which individual summary is better than another. However, despite the very active research on summarization approaches, few works have proposed new evaluation measures in recent years. The standard measures relied upon for the development of summarization systems are most often ROUGE and BLEU which, despite being efficient in overall system ranking, remain lexical in nature and have limited potential when it comes to training neural networks. In this paper, we present a new hybrid evaluation measure for summarization, called HOLMS, that combines both language models pre-trained on large corpora and lexical similarity measures. Through several experiments, we show that HOLMS outperforms ROUGE and BLEU substantially in its correlation with human judgments on several extractive summarization datasets for both linguistic quality and pyramid scores.

Visual Question Generation from Radiology Images
Mourad Sarrouti | Asma Ben Abacha | Dina Demner-Fushman
Proceedings of the First Workshop on Advances in Language and Vision Research

Visual Question Generation (VQG), the task of generating a question based on image contents, is an increasingly important area that combines natural language processing and computer vision. Although some recent works have attempted to generate questions from images in the open domain, the task of VQG in the medical domain has not been explored so far. In this paper, we introduce an approach to the generation of visual questions about radiology images called VQGR, i.e. an algorithm that is able to ask a question when shown an image. VQGR first generates new training data from the existing examples, based on contextual word embeddings and image augmentation techniques. It then uses a variational autoencoder model to encode images into a latent space and decode natural language questions. Experimental automatic evaluations performed on the VQA-RAD dataset of clinical visual questions show that VQGR achieves good performance compared with the baseline system. The source code is available at https://github.com/sarrouti/vqgr.

2019

On the Summarization of Consumer Health Questions
Asma Ben Abacha | Dina Demner-Fushman
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Question understanding is one of the main challenges in question answering. In real world applications, users often submit natural language questions that are longer than needed and include peripheral information that increases the complexity of the question, leading to substantially more false positives in answer retrieval. In this paper, we study neural abstractive models for medical question summarization. We introduce the MeQSum corpus of 1,000 summarized consumer health questions. We explore data augmentation methods and evaluate state-of-the-art neural abstractive models on this new task. In particular, we show that semantic augmentation from question datasets improves the overall performance, and that pointer-generator networks outperform sequence-to-sequence attentional models on this task, with a ROUGE-1 score of 44.16%. We also present a detailed error analysis and discuss directions for improvement that are specific to question summarization.

Proceedings of the 18th BioNLP Workshop and Shared Task
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the 18th BioNLP Workshop and Shared Task

Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering
Asma Ben Abacha | Chaitanya Shivade | Dina Demner-Fushman
Proceedings of the 18th BioNLP Workshop and Shared Task

This paper presents the MEDIQA 2019 shared task organized at the ACL-BioNLP workshop. The shared task is motivated by a need to develop relevant methods, techniques and gold standards for inference and entailment in the medical domain, and their application to improve domain specific information retrieval and question answering systems. MEDIQA 2019 includes three tasks: Natural Language Inference (NLI), Recognizing Question Entailment (RQE), and Question Answering (QA) in the medical domain. 72 teams participated in the challenge, achieving an accuracy of 98% in the NLI task, 74.9% in the RQE task, and 78.3% in the QA task. In this paper, we describe the tasks, the datasets, and the participants’ approaches and results. We hope that this shared task will attract further research efforts in textual inference, question entailment, and question answering in the medical domain.

2018

Proceedings of the BioNLP 2018 workshop
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the BioNLP 2018 workshop

2017

TextFlow: A Text Similarity Measure based on Continuous Sequences
Yassine Mrabet | Halil Kilicoglu | Dina Demner-Fushman
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Text similarity measures are used in multiple tasks such as plagiarism detection, information ranking and recognition of paraphrases and textual entailment. While recent advances in deep learning highlighted the relevance of sequential models in natural language generation, existing similarity measures do not fully exploit the sequential nature of language. Examples of such similarity measures include n-gram and skip-gram overlap, which rely on distinct slices of the input texts. In this paper we present a novel text similarity measure inspired by a common representation in DNA sequence alignment algorithms. The new measure, called TextFlow, represents input text pairs as continuous curves and uses both the actual position of the words and sequence matching to compute the similarity value. Our experiments on 8 different datasets show very encouraging results in paraphrase detection, textual entailment recognition and ranking relevance.

NLM_NIH at SemEval-2017 Task 3: from Question Entailment to Question Similarity for Community Question Answering
Asma Ben Abacha | Dina Demner-Fushman
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our participation in SemEval-2017 Task 3 on Community Question Answering (cQA). The Question Similarity subtask (B) aims to rank a set of related questions retrieved by a search engine according to their similarity to the original question. We adapted our feature-based system for Recognizing Question Entailment (RQE) to the question similarity task. Tested on cQA-B-2016 test data, our RQE system outperformed the best system of the 2016 challenge in all measures with 77.47 MAP and 80.57 Accuracy. On cQA-B-2017 test data, the performance of all systems dropped by around 30 points. Our primary system obtained 44.62 MAP, 67.27 Accuracy and 47.25 F1 score. The cQA-B-2017 best system achieved 47.22 MAP and 42.37 F1 score. Our system is ranked sixth in terms of MAP and third in terms of F1 out of 13 participating teams.

BioNLP 2017
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | Junichi Tsujii
BioNLP 2017

2016

Annotating Named Entities in Consumer Health Questions
Halil Kilicoglu | Asma Ben Abacha | Yassine Mrabet | Kirk Roberts | Laritza Rodriguez | Sonya Shooshan | Dina Demner-Fushman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe a corpus of consumer health questions annotated with named entities. The corpus consists of 1548 de-identified questions about diseases and drugs, written in English. We defined 15 broad categories of biomedical named entities for annotation. A pilot annotation phase in which a small portion of the corpus was double-annotated by four annotators was followed by a main phase in which double annotation was carried out by six annotators, and a reconciliation phase in which all annotations were reconciled by an expert. We conducted the annotation in two modes, manual and assisted, to assess the effect of automatic pre-annotation and calculated inter-annotator agreement. We obtained moderate inter-annotator agreement; assisted annotation yielded slightly better agreement and fewer missed annotations than manual annotation. Due to the complex nature of biomedical entities, we paid particular attention to nested entities, for which we obtained slightly lower inter-annotator agreement, confirming that annotating nested entities is somewhat more challenging. To our knowledge, the corpus is the first of its kind for consumer health text and is publicly available.

Annotating Logical Forms for EHR Questions
Kirk Roberts | Dina Demner-Fushman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task.

A Hybrid Approach to Generation of Missing Abstracts in Biomedical Literature
Suchet Chachra | Asma Ben Abacha | Sonya Shooshan | Laritza Rodriguez | Dina Demner-Fushman
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Readers usually rely on abstracts to identify relevant medical information from scientific articles. Abstracts are also essential to advanced information retrieval methods. More than 50 thousand scientific publications in PubMed lack author-generated abstracts, and the relevancy judgements for these papers have to be based on their titles alone. In this paper, we propose a hybrid summarization technique that aims to select the most pertinent sentences from articles to generate an extractive summary in lieu of a missing abstract. We combine i) health outcome detection, ii) keyphrase extraction, and iii) textual entailment recognition between sentences. We evaluate our hybrid approach and analyze the improvements of multi-factor summarization over techniques that rely on a single method, using a collection of 295 manually generated reference summaries. The obtained results show that the hybrid approach outperforms the baseline techniques with an improvement of 13% in recall and 4% in F1 score.

Proceedings of the 15th Workshop on Biomedical Natural Language Processing
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | Jun-ichi Tsujii
Proceedings of the 15th Workshop on Biomedical Natural Language Processing

Using Learning-To-Rank to Enhance NLM Medical Text Indexer Results
Ilya Zavorin | James Mork | Dina Demner-Fushman
Proceedings of the Fourth BioASQ workshop

Aligning Texts and Knowledge Bases with Semantic Sentence Simplification
Yassine Mrabet | Pavlos Vougiouklis | Halil Kilicoglu | Claire Gardent | Dina Demner-Fushman | Jonathon Hare | Elena Simperl
Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG 2016)

Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
Sophia Ananiadou | Riza Batista-Navarro | Kevin Bretonnel Cohen | Dina Demner-Fushman | Paul Thompson
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)

2015

Proceedings of BioNLP 15
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | Jun-ichi Tsujii
Proceedings of BioNLP 15

2014

Proceedings of BioNLP 2014
Kevin Cohen | Dina Demner-Fushman | Sophia Ananiadou | Jun-ichi Tsujii
Proceedings of BioNLP 2014

Decomposing Consumer Health Questions
Kirk Roberts | Halil Kilicoglu | Marcelo Fiszman | Dina Demner-Fushman
Proceedings of BioNLP 2014

Coreference Resolution for Structured Drug Product Labels
Halil Kilicoglu | Dina Demner-Fushman
Proceedings of BioNLP 2014

Annotating Question Decomposition on Complex Medical Questions
Kirk Roberts | Kate Masterton | Marcelo Fiszman | Halil Kilicoglu | Dina Demner-Fushman
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents a method for annotating question decomposition on complex medical questions. The annotations cover multiple syntactic ways that questions can be decomposed, including separating independent clauses as well as recognizing coordinations and exemplifications. We annotate a corpus of 1,467 multi-sentence consumer health questions about genetic and rare diseases. Furthermore, we label two additional medical-specific annotations: (1) background sentences are annotated with a number of medical categories such as symptoms, treatments, and family history, and (2) the central focus of the complex question (a disease) is marked. We present simple baseline results for automatic classification of these annotations, demonstrating the challenging but important nature of this task.

2013

Proceedings of the 2013 Workshop on Biomedical Natural Language Processing
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing

Interpreting Consumer Health Questions: The Role of Anaphora and Ellipsis
Halil Kilicoglu | Marcelo Fiszman | Dina Demner-Fushman
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing

2012

BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Kevin B. Cohen | Dina Demner-Fushman | Sophia Ananiadou | Bonnie Webber | Jun’ichi Tsujii | John Pestian
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing

Domain Adaptation of Coreference Resolution for Radiology Reports
Emilia Apostolova | Noriko Tomuro | Pattanasak Mongkolwat | Dina Demner-Fushman
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing

2011

Proceedings of BioNLP 2011 Workshop
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of BioNLP 2011 Workshop

Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes
Emilia Apostolova | Noriko Tomuro | Dina Demner-Fushman
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Proceedings of the 2010 Workshop on Biomedical Natural Language Processing
K. Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

2009

Proceedings of the BioNLP 2009 Workshop
K. Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of the BioNLP 2009 Workshop

Using Non-Lexical Features to Identify Effective Indexing Terms for Biomedical Illustrations
Matthew Simpson | Dina Demner-Fushman | Charles Sneiderman | Sameer K. Antani | George R. Thoma
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Towards Automatic Image Region Annotation - Image Region Textual Coreference Resolution
Emilia Apostolova | Dina Demner-Fushman
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

Adapting Naturally Occurring Test Suites for Evaluation of Clinical Question Answering
Dina Demner-Fushman
Software Engineering, Testing, and Quality Assurance for Natural Language Processing

Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Dina Demner-Fushman | Sophia Ananiadou | Kevin Bretonnel Cohen | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

2007

Biological, translational, and clinical language processing
K. Bretonnel Cohen | Dina Demner-Fushman | Carol Friedman | Lynette Hirschman | John Pestian
Biological, translational, and clinical language processing

From indexing the biomedical literature to coding clinical text: experience with MTI and machine learning approaches
Alan R. Aronson | Olivier Bodenreider | Dina Demner-Fushman | Kin Wah Fung | Vivian K. Lee | James G. Mork | Aurélie Névéol | Lee Peters | Willie J. Rogers
Biological, translational, and clinical language processing

Interpreting comparative constructions in biomedical text
Marcelo Fiszman | Dina Demner-Fushman | Francois M. Lang | Philip Goetz | Thomas C. Rindflesch
Biological, translational, and clinical language processing

Answering Clinical Questions with Knowledge-Based and Statistical Techniques
Dina Demner-Fushman | Jimmy Lin
Computational Linguistics, Volume 33, Number 1, March 2007

2006

Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering
Dina Demner-Fushman | Jimmy Lin
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Situated Question Answering in the Clinical Domain: Selecting the Best Drug Treatment for Diseases
Dina Demner-Fushman | Jimmy Lin
Proceedings of the Workshop on Task-Focused Summarization and Question Answering

Generative Content Models for Structural Analysis of Medical Abstracts
Jimmy Lin | Damianos Karakos | Dina Demner-Fushman | Sanjeev Khudanpur
Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology

Will Pyramids Built of Nuggets Topple Over?
Jimmy Lin | Dina Demner-Fushman
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

2005

Automatically Evaluating Answers to Definition Questions
Jimmy Lin | Dina Demner-Fushman
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

Evaluating Summaries and Answers: Two Sides of the Same Coin?
Jimmy Lin | Dina Demner-Fushman
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization