Marten van Schijndel

Also published as: Marten Van Schijndel, Martin van Schijndel


2020

Discourse structure interacts with reference but not syntax in neural language models
Forrest Davis | Marten van Schijndel
Proceedings of the 24th Conference on Computational Natural Language Learning

Language models (LMs) trained on large quantities of text have been claimed to acquire abstract linguistic representations. Our work tests the robustness of these abstractions by focusing on the ability of LMs to learn interactions between different linguistic representations. In particular, we utilized stimuli from psycholinguistic studies showing that humans can condition reference (i.e. coreference resolution) and syntactic processing on the same discourse structure (implicit causality). We compared both transformer and long short-term memory LMs to find that, contrary to humans, implicit causality only influences LM behavior for reference, not syntax, despite model representations that encode the necessary discourse information. Our results further suggest that LM behavior can contradict not only learned representations of discourse but also syntactic agreement, pointing to shortcomings of standard language modeling.

Filler-gaps that neural networks fail to generalize
Debasmita Bhattacharya | Marten van Schijndel
Proceedings of the 24th Conference on Computational Natural Language Learning

It can be difficult to separate abstract linguistic knowledge in recurrent neural networks (RNNs) from surface heuristics. In this work, we probe for highly abstract syntactic constraints that have been claimed to govern the behavior of filler-gap dependencies across different surface constructions. For models to generalize abstract patterns in expected ways to unseen data, they must share representational features in predictable ways. We use cumulative priming to test for representational overlap between disparate filler-gap constructions in English and find evidence that the models learn a general representation for the existence of filler-gap dependencies. However, we find no evidence that the models learn any of the shared underlying grammatical constraints we tested. Our work raises questions about the degree to which RNN language models learn abstract linguistic representations.

Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment
Forrest Davis | Marten van Schijndel
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

A standard approach to evaluating language models analyzes how models assign probabilities to valid versus invalid syntactic constructions (i.e., whether a grammatical sentence is more probable than an ungrammatical one). Our work uses ambiguous relative clause attachment to extend such evaluations to cases of multiple simultaneous valid interpretations, where stark grammaticality differences are absent. We compare model performance in English and Spanish to show that non-linguistic biases in RNN LMs advantageously overlap with syntactic structure in English but not Spanish. Thus, English models may appear to acquire human-like syntactic preferences, while models trained on Spanish fail to acquire comparable human-like preferences. We conclude by relating these results to broader concerns about the relationship between comprehension (i.e., typical language model use cases) and production (which generates the training data for language models), suggesting that necessary linguistic biases are not present in the training signal at all.

2019

Quantity doesn’t buy quality syntax with neural language models
Marten van Schijndel | Aaron Mueller | Tal Linzen
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recurrent neural networks can learn to predict upcoming words remarkably well on average; in syntactically complex contexts, however, they often assign unexpectedly high probabilities to ungrammatical words. We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained. We find that gains from increasing network size are minimal beyond a certain point. Likewise, expanding the training corpus yields diminishing returns; we estimate that the training corpus would need to be unrealistically large for the models to match human performance. A comparison to GPT and BERT, Transformer-based models trained on billions of words, reveals that these models perform even more poorly than our LSTMs in some constructions. Our results make the case for more data-efficient architectures.

Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models
Grusha Prasad | Marten van Schijndel | Tal Linzen
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Neural language models (LMs) perform well on tasks that require sensitivity to syntactic structure. Drawing on the syntactic priming paradigm from psycholinguistics, we propose a novel technique to analyze the representations that enable such success. By establishing a gradient similarity metric between structures, this technique allows us to reconstruct the organization of the LMs’ syntactic representational space. We use this technique to demonstrate that LSTM LMs’ representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that the LMs track abstract properties of the sentence.

Can Entropy Explain Successor Surprisal Effects in Reading?
Marten van Schijndel | Tal Linzen
Proceedings of the Society for Computation in Linguistics (SCiL) 2019

2018

A Neural Model of Adaptation in Reading
Marten van Schijndel | Tal Linzen
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

It has been argued that humans rapidly adapt their lexical and syntactic expectations to match the statistics of the current linguistic context. We provide further support to this claim by showing that the addition of a simple adaptation mechanism to a neural language model improves our predictions of human reading times compared to a non-adaptive model. We analyze the performance of the model on controlled materials from psycholinguistic experiments and show that it adapts not only to lexical items but also to abstract syntactic structures.

Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018)
Asad Sayeed | Cassandra Jacobs | Tal Linzen | Marten van Schijndel
Proceedings of the 8th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2018)

2017

Proceedings of the 7th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2017)
Ted Gibson | Tal Linzen | Asad Sayeed | Martin van Schijndel | William Schuler
Proceedings of the 7th Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2017)

2016

Addressing surprisal deficiencies in reading time models
Marten van Schijndel | William Schuler
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

This study demonstrates a weakness in how n-gram and PCFG surprisal are used to predict reading times in eye-tracking data. In particular, the information conveyed by words skipped during saccades is not usually included in the surprisal measures. This study shows that correcting the surprisal calculation improves n-gram surprisal and that upcoming n-grams affect reading times, replicating previous findings of how lexical frequencies affect reading times. In contrast, the predictivity of PCFG surprisal does not benefit from the surprisal correction despite the fact that lexical sequences skipped by saccades are processed by readers, as demonstrated by the corrected n-gram measure. These results raise questions about the formulation of information-theoretic measures of syntactic processing such as PCFG surprisal and entropy reduction when applied to reading times.

Memory access during incremental sentence processing causes reading time latency
Cory Shain | Marten van Schijndel | Richard Futrell | Edward Gibson | William Schuler
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

Studies on the role of memory as a predictor of reading time latencies (1) differ in their predictions about when memory effects should occur in processing and (2) have had mixed results, with strong positive effects emerging from isolated constructed stimuli and weak or even negative effects emerging from naturally-occurring stimuli. Our study addresses these concerns by comparing several implementations of prominent sentence processing theories on an exploratory corpus and evaluating the most successful of these on a confirmatory corpus, using a new self-paced reading corpus of seemingly natural narratives constructed to contain an unusually high proportion of memory-intensive constructions. We show highly significant and complementary broad-coverage latency effects both for predictors based on the Dependency Locality Theory and for predictors based on a left-corner parsing model of sentence processing. Our results indicate that memory access during sentence processing does take time, but suggest that stimuli requiring many memory access events may be necessary in order to observe the effect.

2015

Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics
Tim O’Donnell | Marten van Schijndel
Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics

Evidence of syntactic working memory usage in MEG data
Marten van Schijndel | Brian Murphy | William Schuler
Proceedings of the 6th Workshop on Cognitive Modeling and Computational Linguistics

Hierarchic syntax improves reading time prediction
Marten van Schijndel | William Schuler
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

AZMAT: Sentence Similarity Using Associative Matrices
Evan Jaffe | Lifeng Jin | David King | Marten van Schijndel
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Bootstrapping into Filler-Gap: An Acquisition Story
Marten van Schijndel | Micha Elsner
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

An Analysis of Memory-based Processing Costs using Incremental Deep Syntactic Dependency Parsing
Marten van Schijndel | Luan Nguyen | William Schuler
Proceedings of the Fourth Annual Workshop on Cognitive Modeling and Computational Linguistics (CMCL)

An Analysis of Frequency- and Memory-Based Processing Costs
Marten van Schijndel | William Schuler
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

Connectionist-Inspired Incremental PCFG Parsing
Marten van Schijndel | Andy Exley | William Schuler
Proceedings of the 3rd Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2012)

Accurate Unbounded Dependency Recovery using Generalized Categorial Grammars
Luan Nguyen | Marten Van Schijndel | William Schuler
Proceedings of COLING 2012