James Mullenbach


2019

pdf bib
Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects
James Mullenbach | Jonathan Gordon | Nanyun Peng | Jonathan May
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

How do adjectives project from a noun to its parts? If a motorcycle is red, are its wheels red? Is a nuclear submarine’s captain nuclear? These questions are easy for humans to judge using our commonsense understanding of the world, but are difficult for computers. To attack this challenge, we crowdsource a set of human judgments that answer the English-language question “Given a whole described by an adjective, does the adjective also describe a given part?” We build strong baselines for this task with a classification approach. Our findings indicate that, despite the recent successes of large language models on tasks aimed to assess commonsense knowledge, these models do not greatly outperform simple word-level models based on pre-trained word embeddings. This provides evidence that the amount of commonsense knowledge encoded in these language models does not extend far beyond that already baked into the word embeddings. Our dataset will serve as a useful testbed for future research in commonsense reasoning, especially as it relates to adjectives and objects

2018

pdf bib
Explainable Prediction of Medical Codes from Clinical Text
James Mullenbach | Sarah Wiegreffe | Jon Duke | Jimeng Sun | Jacob Eisenstein
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Clinical notes are text documents that are created by clinicians for each patient encounter. They are typically accompanied by medical codes, which describe the diagnosis and treatment. Annotating these codes is labor intensive and error prone; furthermore, the connection between the codes and the text is not annotated, obscuring the reasons and details behind specific diagnoses and treatments. We present an attentional convolutional network that predicts medical codes from clinical text. Our method aggregates information across the document using a convolutional neural network, and uses an attention mechanism to select the most relevant segments for each of the thousands of possible codes. The method is accurate, achieving precision@8 of 0.71 and a Micro-F1 of 0.54, which are both better than the prior state of the art. Furthermore, through an interpretability evaluation by a physician, we show that the attention mechanism identifies meaningful explanations for each code assignment.