Frank Rudzicz


2020

pdf bib
Exploring Text Specific and Blackbox Fairness Algorithms in Multimodal Clinical NLP
John Chen | Ian Berlot-Attwell | Xindi Wang | Safwan Hossain | Frank Rudzicz
Proceedings of the 3rd Clinical Natural Language Processing Workshop

Clinical machine learning is increasingly multimodal, collected in both structured tabular formats and unstructured forms such as free text. We propose a novel task of exploring fairness on a multimodal clinical dataset, adopting equalized odds for the downstream medical prediction tasks. To this end, we investigate a modality-agnostic fairness algorithm - equalized odds post processing - and compare it to a text-specific fairness algorithm: debiased clinical word embeddings. Despite the fact that debiased word embeddings do not explicitly address equalized odds of protected groups, we show that a text-specific approach to fairness may simultaneously achieve a good balance of performance classical notions of fairness. Our work opens the door for future work at the critical intersection of clinical NLP and fairness.

pdf bib
Identification of Primary and Collateral Tracks in Stuttered Speech
Rachid Riad | Anne-Catherine Bachoud-Lévi | Frank Rudzicz | Emmanuel Dupoux
Proceedings of the 12th Language Resources and Evaluation Conference

Disfluent speech has been previously addressed from two main perspectives: the clinical perspective focusing on diagnostic, and the Natural Language Processing (NLP) perspective aiming at modeling these events and detect them for downstream tasks. In addition, previous works often used different metrics depending on whether the input features are text or speech, making it difficult to compare the different contributions. Here, we introduce a new evaluation framework for disfluency detection inspired by the clinical and NLP perspective together with the theory of performance from (Clark, 1996) which distinguishes between primary and collateral tracks. We introduce a novel forced-aligned disfluency dataset from a corpus of semi-directed interviews, and present baseline results directly comparing the performance of text-based features (word and span information) and speech-based (acoustic-prosodic information). Finally, we introduce new audio features inspired by the word-based span features. We show experimentally that using these features outperformed the baselines for speech-based predictions on the present dataset.

pdf bib
Word class flexibility: A deep contextualized approach
Bai Li | Guillaume Thomas | Yang Xu | Frank Rudzicz
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Word class flexibility refers to the phenomenon whereby a single word form is used across different grammatical categories. Extensive work in linguistic typology has sought to characterize word class flexibility across languages, but quantifying this phenomenon accurately and at scale has been fraught with difficulties. We propose a principled methodology to explore regularity in word class flexibility. Our method builds on recent work in contextualized word embeddings to quantify semantic shift between word classes (e.g., noun-to-verb, verb-to-noun), and we apply this method to 37 languages. We find that contextualized embeddings not only capture human judgment of class variation within words in English, but also uncover shared tendencies in class flexibility across languages. Specifically, we find greater semantic variation when flexible lemmas are used in their dominant word class, supporting the view that word class flexibility is a directional process. Our work highlights the utility of deep contextualized models in linguistic typology.

pdf bib
Explainable Clinical Decision Support from Text
Jinyue Feng | Chantal Shaib | Frank Rudzicz
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Clinical prediction models often use structured variables and provide outcomes that are not readily interpretable by clinicians. Further, free-text medical notes may contain information not immediately available in structured variables. We propose a hierarchical CNN-transformer model with explicit attention as an interpretable, multi-task clinical language model, which achieves an AUROC of 0.75 and 0.78 on sepsis and mortality prediction, respectively. We also explore the relationships between learned features from structured and unstructured variables using projection-weighted canonical correlation analysis. Finally, we outline a protocol to evaluate model usability in a clinical decision support context. From domain-expert evaluations, our model generates informative rationales that have promising real-life applications.

pdf bib
On Losses for Modern Language Models
Stéphane Aroca-Ouellette | Frank Rudzicz
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

BERT set many state-of-the-art results over varied NLU benchmarks by pre-training over two tasks: masked language modelling (MLM) and next sentence prediction (NSP), the latter of which has been highly criticized. In this paper, we 1) clarify NSP’s effect on BERT pre-training, 2) explore fourteen possible auxiliary pre-training tasks, of which seven are novel to modern language models, and 3) investigate different ways to include multiple tasks into pre-training. We show that NSP is detrimental to training due to its context splitting and shallow semantic signal. We also identify six auxiliary pre-training tasks – sentence ordering, adjacent sentence prediction, TF prediction, TF-IDF prediction, a FastSent variant, and a Quick Thoughts variant – that outperform a pure MLM baseline. Finally, we demonstrate that using multiple tasks in a multi-task pre-training framework provides better results than using any single auxiliary task. Using these methods, we outperform BERTBase on the GLUE benchmark using fewer than a quarter of the training tokens.

pdf bib
An information theoretic view on selecting linguistic probes
Zining Zhu | Frank Rudzicz
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

There is increasing interest in assessing the linguistic knowledge encoded in neural representations. A popular approach is to attach a diagnostic classifier – or ”probe” – to perform supervised classification from internal representations. However, how to select a good probe is in debate. Hewitt and Liang (2019) showed that a high performance on diagnostic classification itself is insufficient, because it can be attributed to either ”the representation being rich in knowledge”, or ”the probe learning the task”, which Pimentel et al. (2020) challenged. We show this dichotomy is valid information-theoretically. In addition, we find that the ”good probe” criteria proposed by the two papers, *selectivity* (Hewitt and Liang, 2019) and *information gain* (Pimentel et al., 2020), are equivalent – the errors of their approaches are identical (modulo irrelevant terms). Empirically, these two selection criteria lead to results that highly agree with each other.

pdf bib
Examining the rhetorical capacities of neural language models
Zining Zhu | Chuer Pan | Mohamed Abdalla | Frank Rudzicz
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Recently, neural language models (LMs) have demonstrated impressive abilities in generating high-quality discourse. While many recent papers have analyzed the syntactic aspects encoded in LMs, there has been no analysis to date of the inter-sentential, rhetorical knowledge. In this paper, we propose a method that quantitatively evaluates the rhetorical capacities of neural LMs. We examine the capacities of neural LMs understanding the rhetoric of discourse by evaluating their abilities to encode a set of linguistic features derived from Rhetorical Structure Theory (RST). Our experiments show that BERT-based LMs outperform other Transformer LMs, revealing the richer discourse knowledge in their intermediate layer representations. In addition, GPT-2 and XLNet apparently encode less rhetorical knowledge, and we suggest an explanation drawing from linguistic philosophy. Our method shows an avenue towards quantifying the rhetorical capacities of neural LMs.

pdf bib
Representation Learning for Discovering Phonemic Tone Contours
Bai Li | Jing Yi Xie | Frank Rudzicz
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

Tone is a prosodic feature used to distinguish words in many languages, some of which are endangered and scarcely documented. In this work, we use unsupervised representation learning to identify probable clusters of syllables that share the same phonemic tone. Our method extracts the pitch for each syllable, then trains a convolutional autoencoder to learn a low-dimensional representation for each contour. We then apply the mean shift algorithm to cluster tones in high-density regions of the latent space. Furthermore, by feeding the centers of each cluster into the decoder, we produce a prototypical contour that represents each cluster. We apply this method to spoken multi-syllable words in Mandarin Chinese and Cantonese and evaluate how closely our clusters match the ground truth tone categories. Finally, we discuss some difficulties with our approach, including contextual tone variation and allophony effects.

2019

pdf bib
Lexical Features Are More Vulnerable, Syntactic Features Have More Predictive Power
Jekaterina Novikova | Aparna Balagopalan | Ksenia Shkaruta | Frank Rudzicz
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

Understanding the vulnerability of linguistic features extracted from noisy text is important for both developing better health text classification models and for interpreting vulnerabilities of natural language models. In this paper, we investigate how generic language characteristics, such as syntax or the lexicon, are impacted by artificial text alterations. The vulnerability of features is analysed from two perspectives: (1) the level of feature value change, and (2) the level of change of feature predictive power as a result of text modifications. We show that lexical features are more sensitive to text modifications than syntactic ones. However, we also demonstrate that these smaller changes of syntactic features have a stronger influence on classification performance downstream, compared to the impact of changes to lexical features. Results are validated across three datasets representing different text-classification tasks, with different levels of lexical and syntactic complexity of both conversational and written language.

pdf bib
Extracting relevant information from physician-patient dialogues for automated clinical note taking
Serena Jeblee | Faiza Khan Khattak | Noah Crampton | Muhammad Mamdani | Frank Rudzicz
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)

We present a system for automatically extracting pertinent medical information from dialogues between clinicians and patients. The system parses each dialogue and extracts entities such as medications and symptoms, using context to predict which entities are relevant. We also classify the primary diagnosis for each conversation. In addition, we extract topic information and identify relevant utterances. This serves as a baseline for a system that extracts information from dialogues and automatically generates a patient note, which can be reviewed and edited by the clinician.

pdf bib
How do we feel when a robot dies? Emotions expressed on Twitter before and after hitchBOT’s destruction
Kathleen C. Fraser | Frauke Zeller | David Harris Smith | Saif Mohammad | Frank Rudzicz
Proceedings of the Tenth Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In 2014, a chatty but immobile robot called hitchBOT set out to hitchhike across Canada. It similarly made its way across Germany and the Netherlands, and had begun a trip across the USA when it was destroyed by vandals. In this work, we analyze the emotions and sentiments associated with words in tweets posted before and after hitchBOT’s destruction to answer two questions: Were there any differences in the emotions expressed across the different countries visited by hitchBOT? And how did the public react to the demise of hitchBOT? Our analyses indicate that while there were few cross-cultural differences in sentiment towards hitchBOT, there was a significant negative emotional reaction to its destruction, suggesting that people had formed an emotional connection with hitchBOT and perceived its destruction as morally wrong. We discuss potential implications of anthropomorphism and emotional attachment to robots from the perspective of robot ethics.

pdf bib
Predicting ICU transfers using text messages between nurses and doctors
Faiza Khan Khattak | Chloé Pou-Prom | Robert Wu | Frank Rudzicz
Proceedings of the 2nd Clinical Natural Language Processing Workshop

We explore the use of real-time clinical information, i.e., text messages sent between nurses and doctors regarding patient conditions in order to predict transfer to the intensive care unit(ICU). Preliminary results, in data from five hospitals, indicate that, despite being short and full of noise, text messages can augment other visit information to improve the performance of ICU transfer prediction.

pdf bib
Generative Adversarial Networks for Text Using Word2vec Intermediaries
Akshay Budhkar | Krishnapriya Vishnubhotla | Safwan Hossain | Frank Rudzicz
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Generative adversarial networks (GANs) have shown considerable success, especially in the realistic generation of images. In this work, we apply similar techniques for the generation of text. We propose a novel approach to handle the discrete nature of text, during training, using word embeddings. Our method is agnostic to vocabulary size and achieves competitive results relative to methods with various discrete gradient estimators.

pdf bib
Detecting cognitive impairments by agreeing on interpretations of linguistic features
Zining Zhu | Jekaterina Novikova | Frank Rudzicz
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Linguistic features have shown promising applications for detecting various cognitive impairments. To improve detection accuracies, increasing the amount of data or the number of linguistic features have been two applicable approaches. However, acquiring additional clinical data can be expensive, and hand-crafting features is burdensome. In this paper, we take a third approach, proposing Consensus Networks (CNs), a framework to classify after reaching agreements between modalities. We divide linguistic features into non-overlapping subsets according to their modalities, and let neural networks learn low-dimensional representations that agree with each other. These representations are passed into a classifier network. All neural networks are optimized iteratively. In this paper, we also present two methods that improve the performance of CNs. We then present ablation studies to illustrate the effectiveness of modality division. To understand further what happens in CNs, we visualize the representations during training. Overall, using all of the 413 linguistic features, our models significantly outperform traditional classifiers, which are used by the state-of-the-art papers.

pdf bib
Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus
Bai Li | Yi-Te Hsu | Frank Rudzicz
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Machine learning has shown promise for automatic detection of Alzheimer’s disease (AD) through speech; however, efforts are hampered by a scarcity of data, especially in languages other than English. We propose a method to learn a correspondence between independently engineered lexicosyntactic features in two languages, using a large parallel corpus of out-of-domain movie dialogue data. We apply it to dementia detection in Mandarin Chinese, and demonstrate that our method outperforms both unilingual and machine translation-based baselines. This appears to be the first study that transfers feature domains in detecting cognitive decline.

pdf bib
Multilingual prediction of Alzheimer’s disease through domain adaptation and concept-based language modelling
Kathleen C. Fraser | Nicklas Linz | Bai Li | Kristina Lundholm Fors | Frank Rudzicz | Alexandra König | Jan Alexandersson | Philippe Robert | Dimitrios Kokkinakis
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

There is growing evidence that changes in speech and language may be early markers of dementia, but much of the previous NLP work in this area has been limited by the size of the available datasets. Here, we compare several methods of domain adaptation to augment a small French dataset of picture descriptions (n = 57) with a much larger English dataset (n = 550), for the task of automatically distinguishing participants with dementia from controls. The first challenge is to identify a set of features that transfer across languages; in addition to previously used features based on information units, we introduce a new set of features to model the order in which information units are produced by dementia patients and controls. These concept-based language model features improve classification performance in both English and French separately, and the best result (AUC = 0.89) is achieved using the multilingual training set with a combination of information and language model features.

pdf bib
Augmenting word2vec with latent Dirichlet allocation within a clinical application
Akshay Budhkar | Frank Rudzicz
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

This paper presents three hybrid models that directly combine latent Dirichlet allocation and word embedding for distinguishing between speakers with and without Alzheimer’s disease from transcripts of picture descriptions. Two of our models get F-scores over the current state-of-the-art using automatic methods on the DementiaBank dataset.

2018

pdf bib
Learning multiview embeddings for assessing dementia
Chloé Pou-Prom | Frank Rudzicz
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

As the incidence of Alzheimer’s Disease (AD) increases, early detection becomes crucial. Unfortunately, datasets for AD assessment are often sparse and incomplete. In this work, we leverage the multiview nature of a small AD dataset, DementiaBank, to learn an embedding that captures different modes of cognitive impairment. We apply generalized canonical correlation analysis (GCCA) to our dataset and demonstrate the added benefit of using multiview embeddings in two downstream tasks: identifying AD and predicting clinical scores. By including multiview embeddings, we obtain an F1 score of 0.82 in the classification task and a mean absolute error of 3.42 in the regression task. Furthermore, we show that multiview embeddings can be obtained from other datasets as well.

2017

pdf bib
Identifying and Avoiding Confusion in Dialogue with People with Alzheimer’s Disease
Hamidreza Chinaei | Leila Chan Currie | Andrew Danks | Hubert Lin | Tejas Mehta | Frank Rudzicz
Computational Linguistics, Volume 43, Issue 2 - June 2017

Alzheimer’s disease (AD) is an increasingly prevalent cognitive disorder in which memory, language, and executive function deteriorate, usually in that order. There is a growing need to support individuals with AD and other forms of dementia in their daily lives, and our goal is to do so through speech-based interaction. Given that 33% of conversations with people with middle-stage AD involve a breakdown in communication, it is vital that automated dialogue systems be able to identify those breakdowns and, if possible, avoid them. In this article, we discuss several linguistic features that are verbal indicators of confusion in AD (including vocabulary richness, parse tree structures, and acoustic cues) and apply several machine learning algorithms to identify dialogue-relevant confusion from speech with up to 82% accuracy. We also learn dialogue strategies to avoid confusion in the first place, which is accomplished using a partially observable Markov decision process and which obtains accuracies (up to 96.1%) that are significantly higher than several baselines. This work represents a major step towards automated dialogue systems for individuals with dementia.

pdf bib
Detecting Anxiety through Reddit
Judy Hanwen Shen | Frank Rudzicz
Proceedings of the Fourth Workshop on Computational Linguistics and Clinical Psychology — From Linguistic Signal to Clinical Reality

Previous investigations into detecting mental illnesses through social media have predominately focused on detecting depression through Twitter corpora. In this paper, we study anxiety disorders through personal narratives collected through the popular social media website, Reddit. We build a substantial data set of typical and anxiety-related posts, and we apply N-gram language modeling, vector embeddings, topic analysis, and emotional norms to generate features that accurately classify posts related to binary levels of anxiety. We achieve an accuracy of 91% with vector-space word embeddings, and an accuracy of 98% when combined with lexicon-based features.

2016

pdf bib
Vector-space topic models for detecting Alzheimer’s disease
Maria Yancheva | Frank Rudzicz
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Detecting late-life depression in Alzheimer’s disease through analysis of speech and language
Kathleen C. Fraser | Frank Rudzicz | Graeme Hirst
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology

2015

pdf bib
Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies
Jan Alexandersson | Ercan Altinsoy | Heidi Christensen | Peter Ljunglöf | François Portet | Frank Rudzicz
Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Automatic dysfluency detection in dysarthric speech using deep belief networks
Stacey Oue | Ricard Marxer | Frank Rudzicz
Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Remote Speech Technology for Speech Professionals - the CloudCAST initiative
Phil Green | Ricard Marxer | Stuart Cunningham | Heidi Christensen | Frank Rudzicz | Maria Yancheva | André Coy | Massimuliano Malavasi | Lorenzo Desideri
Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Using linguistic features longitudinally to predict clinical scores for Alzheimer’s disease and related dementias
Maria Yancheva | Kathleen Fraser | Frank Rudzicz
Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies

2014

pdf bib
Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies
Jan Alexandersson | Dimitra Anastasiou | Cui Jian | Ani Nenkova | Rupal Patel | Frank Rudzicz | Annalu Waller | Desislava Zhekova
Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Speech recognition in Alzheimer’s disease with personal assistive robots
Frank Rudzicz | Rosalie Wang | Momotaz Begum | Alex Mihailidis
Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies

2013

pdf bib
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies
Jan Alexandersson | Peter Ljunglöf | Kathleen F. McCoy | François Portet | Brian Roark | Frank Rudzicz | Michel Vacher
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Automatic speech recognition in the diagnosis of primary progressive aphasia
Kathleen Fraser | Frank Rudzicz | Naida Graham | Elizabeth Rochon
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Automatic detection of deception in child-produced speech using syntactic complexity features
Maria Yancheva | Frank Rudzicz
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

pdf bib
Communication strategies for a computerized caregiver for individuals with Alzheimer’s disease
Frank Rudzicz | Rozanne Wilson | Alex Mihailidis | Elizabeth Rochon | Carol Leonard
Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies

2011

pdf bib
Acoustic transformations to improve the intelligibility of dysarthric speech
Frank Rudzicz
Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies

2010

pdf bib
Proceedings of the NAACL HLT 2010 Student Research Workshop
Julia Hockenmaier | Diane Litman | Adriane Boyd | Mahesh Joshi | Frank Rudzicz
Proceedings of the NAACL HLT 2010 Student Research Workshop

pdf bib
Towards a noisy-channel model of dysarthria in speech recognition
Frank Rudzicz
Proceedings of the NAACL HLT 2010 Workshop on Speech and Language Processing for Assistive Technologies

pdf bib
Correcting Errors in Speech Recognition with Articulatory Dynamics
Frank Rudzicz
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

pdf bib
Summarizing multiple spoken documents: finding evidence from untranscribed audio
Xiaodan Zhu | Gerald Penn | Frank Rudzicz
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2006

pdf bib
Clavius: Bi-Directional Parsing for Generic Multimodal Interaction
Frank Rudzicz
Proceedings of the COLING/ACL 2006 Student Research Workshop