Jonathan Brennan


2020

The Alice Datasets: fMRI & EEG Observations of Natural Language Comprehension
Shohini Bhattasali | Jonathan Brennan | Wen-Ming Luh | Berta Franzluebbers | John Hale
Proceedings of the 12th Language Resources and Evaluation Conference

The Alice Datasets are a set of datasets based on magnetic resonance data and electrophysiological data, collected while participants heard a story in English. Along with the datasets and the text of the story, we provide a variety of linguistic and computational measures, ranging from prosodic predictors to predictors capturing hierarchical syntactic information. These ecologically valid datasets can be easily reused to replicate prior work and to test new hypotheses about natural language comprehension in the brain.

The Little Prince in 26 Languages: Towards a Multilingual Neuro-Cognitive Corpus
Sabrina Stehwien | Lena Henke | John Hale | Jonathan Brennan | Lars Meyer
Proceedings of the Second Workshop on Linguistic and Neurocognitive Resources

We present the Le Petit Prince Corpus (LPPC), a multi-lingual resource for research in (computational) psycho- and neurolinguistics. The corpus consists of the children’s story The Little Prince in 26 languages. The dataset is in the process of being built using state-of-the-art methods for speech and language processing and electroencephalography (EEG). The planned release of the LPPC dataset will include raw text annotated with dependency graphs in the Universal Dependencies standard, a near-natural-sounding synthetic spoken subset, and EEG recordings. We will use this corpus for conducting neurolinguistic studies that generalize across a wide range of languages, overcoming typological constraints to traditional approaches. The planned release of the LPPC combines linguistic and EEG data for many languages using fully automatic methods, and thus constitutes a readily extendable resource that supports cross-linguistic and cross-disciplinary research.

2019

Text Genre and Training Data Size in Human-like Parsing
John Hale | Adhiguna Kuncoro | Keith Hall | Chris Dyer | Jonathan Brennan
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Domain-specific training typically makes NLP systems work better. We show that this extends to cognitive modeling as well by relating the states of a neural phrase-structure parser to electrophysiological measures from human participants. These measures were recorded as participants listened to a spoken recitation of the same literary text that was supplied as input to the neural parser. Given more training data, the system derives a better cognitive model — but only when the training examples come from the same textual genre. This finding is consistent with the idea that humans adapt syntactic expectations to particular genres during language comprehension (Kaan and Chun, 2018; Branigan and Pickering, 2017).

2018

Finding syntax in human encephalography with beam search
John Hale | Chris Dyer | Adhiguna Kuncoro | Jonathan Brennan
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recurrent neural network grammars (RNNGs) are generative models of (tree, string) pairs that rely on neural networks to evaluate derivational choices. Parsing with them using beam search yields a variety of incremental complexity metrics such as word surprisal and parser action count. When used as regressors against human electrophysiological responses to naturalistic text, they derive two amplitude effects: an early peak and a P600-like later peak. By contrast, a non-syntactic neural language model yields no reliable effects. Model comparisons attribute the early peak to syntactic composition within the RNNG. This pattern of results recommends the RNNG+beam search combination as a mechanistic model of the syntactic processing that occurs during normal human language comprehension.

Differentiating Phrase Structure Parsing and Memory Retrieval in the Brain
Shohini Bhattasali | John Hale | Christophe Pallier | Jonathan Brennan | Wen-Ming Luh | R. Nathan Spreng
Proceedings of the Society for Computation in Linguistics (SCiL) 2018

2016

Temporal Lobes as Combinatory Engines for both Form and Meaning
Jixing Li | Jonathan Brennan | Adam Mahar | John Hale
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

The relative contributions of meaning and form to sentence processing remain an outstanding issue across the language sciences. We examine this issue by formalizing four incremental complexity metrics and comparing them against freely available ROI timecourses. Syntax-related metrics based on top-down parsing and structural dependency-distance turn out to significantly improve a regression model, compared to a simpler model that formalizes only conceptual combination using a distributional vector-space model. This confirms the view of the anterior temporal lobes as combinatory engines that deal in both form (see e.g., Brennan et al., 2012; Mazoyer, 1993) and meaning (see e.g., Patterson et al., 2007). This same characterization applies to a posterior temporal region in roughly “Wernicke’s Area.”

2015

Modeling fMRI time courses with linguistic structure at various grain sizes
John Hale | David Lutz | Wen-Ming Luh | Jonathan Brennan
Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics