Johannes Bjerva


2020

pdf bib
Zero-Shot Cross-Lingual Transfer with Meta Learning
Farhad Nooralahzadeh | Giannis Bekoulis | Johannes Bjerva | Isabelle Augenstein
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Learning what to share between tasks has become a topic of great importance, as strategic sharing of knowledge has been shown to improve downstream task performance. This is particularly important for multilingual applications, as most languages in the world are under-resourced. Here, we consider the setting of training models on multiple different languages at the same time, when little or no data is available for languages other than English. We show that this challenging setup can be approached using meta-learning: in addition to training a source language model, another model learns to select which training instances are the most beneficial to the first. We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks (natural language inference, question answering). Our extensive experimental setup demonstrates the consistent effectiveness of meta-learning for a total of 15 languages. We improve upon the state-of-the-art for zero-shot and few-shot NLI (on MultiNLI and XNLI) and QA (on the MLQA dataset). A comprehensive error analysis indicates that the correlation of typological features between languages can partly explain when parameter sharing learned via meta-learning is beneficial.

pdf bib
SubjQA: A Dataset for Subjectivity and Review Comprehension
Johannes Bjerva | Nikita Bhutani | Behzad Golshan | Wang-Chiew Tan | Isabelle Augenstein
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to be important for sentiment analysis and word-sense disambiguation. Furthermore, subjectivity is an important aspect of user-generated data. In spite of this, subjectivity has not been investigated in contexts where such data is widespread, such as in question answering (QA). We develop a new dataset which allows us to investigate this relationship. We find that subjectivity is an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance than found in previous work on sentiment analysis. For instance, a subjective question may or may not be associated with a subjective answer. We release an English QA dataset (SubjQA) based on customer reviews, containing subjectivity annotations for questions and answer spans across 6 domains.

pdf bib
Unsupervised Evaluation for Question Answering with Transformers
Lukas Muttenthaler | Isabelle Augenstein | Johannes Bjerva
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

It is challenging to automatically evaluate the answer of a QA model at inference time. Although many models provide confidence scores, and simple heuristics can go a long way towards indicating answer correctness, such measures are heavily dataset-dependent and are unlikely to generalise. In this work, we begin by investigating the hidden representations of questions, answers, and contexts in transformer-based QA architectures. We observe a consistent pattern in the answer representations, which we show can be used to automatically evaluate whether or not a predicted answer span is correct. Our method does not require any labelled data and outperforms strong heuristic baselines, across 2 datasets and 7 domains. We are able to predict whether or not a model’s answer is correct with 91.37% accuracy on SQuAD, and 80.7% accuracy on SubjQA. We expect that this method will have broad applications, e.g., in semi-automatic development of QA datasets.

pdf bib
SIGTYP 2020 Shared Task: Prediction of Typological Features
Johannes Bjerva | Elizabeth Salesky | Sabrina J. Mielke | Aditi Chaudhary | Celano Giuseppe | Edoardo Maria Ponti | Ekaterina Vylomova | Ryan Cotterell | Isabelle Augenstein
Proceedings of the Second Workshop on Computational Research in Linguistic Typology

Typological knowledge bases (KBs) such as WALS (Dryer and Haspelmath, 2013) contain information about linguistic properties of the world’s languages. They have been shown to be useful for downstream applications, including cross-lingual transfer learning and linguistic probing. A major drawback hampering broader adoption of typological KBs is that they are sparsely populated, in the sense that most languages only have annotations for some features, and skewed, in that few features have wide coverage. As typological features often correlate with one another, it is possible to predict them and thus automatically populate typological KBs, which is also the focus of this shared task. Overall, the task attracted 8 submissions from 5 teams, out of which the most successful methods make use of such feature correlations. However, our error analysis reveals that even the strongest submitted systems struggle with predicting feature values for languages where few features are known.

2019

pdf bib
Transductive Auxiliary Task Self-Training for Neural Multi-Task Models
Johannes Bjerva | Katharina Kann | Isabelle Augenstein
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)

Multi-task learning and self-training are two common ways to improve a machine learning model’s performance in settings with limited training data. Drawing heavily on ideas from those two approaches, we suggest transductive auxiliary task self-training: training a multi-task model on (i) a combination of main and auxiliary task training data, and (ii) test instances with auxiliary task labels which a single-task version of the model has previously generated. We perform extensive experiments on 86 combinations of languages and tasks. Our results are that, on average, transductive auxiliary task self-training improves absolute accuracy by up to 9.56% over the pure multi-task model for dependency relation tagging and by up to 13.03% for semantic tagging.

pdf bib
Uncovering Probabilistic Implications in Typological Knowledge Bases
Johannes Bjerva | Yova Kementchedjhieva | Ryan Cotterell | Isabelle Augenstein
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages with object-verb word ordering tend to have postpositions. Uncovering such implications typically amounts to time-consuming manual processing by trained and experienced linguists, which potentially leaves key linguistic universals unexplored. In this paper, we present a computational model which successfully identifies known universals, including Greenberg universals, but also uncovers new ones, worthy of further linguistic investigation. Our approach outperforms baselines previously used for this problem, as well as a strong baseline from knowledge base population.

pdf bib
What Do Language Representations Really Represent?
Johannes Bjerva | Robert Östling | Maria Han Veiga | Jörg Tiedemann | Isabelle Augenstein
Computational Linguistics, Volume 45, Issue 2 - June 2019

A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words end up with similar representations. If the corpus is multilingual, the same model can be used to learn distributed representations of languages, such that similar languages end up with similar representations. We show that this holds even when the multilingual corpus has been translated into English, by picking up the faint signal left by the source languages. However, just as it is a thorny problem to separate semantic from syntactic similarity in word representations, it is not obvious what type of similarity is captured by language representations. We investigate correlations and causal relationships between language representations learned from translations on one hand, and genetic, geographical, and several levels of structural similarity between languages on the other. Of these, structural similarity is found to correlate most strongly with language representation similarity, whereas genetic relationships—a convenient benchmark used for evaluation in previous work—appears to be a confounding factor. Apart from implications about translation effects, we see this more generally as a case where NLP and linguistic typology can interact and benefit one another.

pdf bib
A Probabilistic Generative Model of Linguistic Typology
Johannes Bjerva | Yova Kementchedjhieva | Ryan Cotterell | Isabelle Augenstein
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In the principles-and-parameters framework, the structural features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features. The implied covariance between features inspires our probabilisation of this line of linguistic inquiry—we develop a generative model of language based on exponential-family matrix factorisation. By modelling all languages and features within the same architecture, we show how structural similarities between languages can be exploited to predict typological features with near-perfect accuracy, outperforming several baselines on the task of predicting held-out features. Furthermore, we show that language embeddings pre-trained on monolingual text allow for generalisation to unobserved languages. This finding has clear practical and also theoretical implications: the results confirm what linguists have hypothesised, i.e. that there are significant correlations between typological features and languages.

2018

pdf bib
KU-MTL at SemEval-2018 Task 1: Multi-task Identification of Affect in Tweets
Thomas Nyegaard-Signori | Casper Veistrup Helms | Johannes Bjerva | Isabelle Augenstein
Proceedings of The 12th International Workshop on Semantic Evaluation

We take a multi-task learning approach to the shared Task 1 at SemEval-2018. The general idea concerning the model structure is to use as little external data as possible in order to preserve the task relatedness and reduce complexity. We employ multi-task learning with hard parameter sharing to exploit the relatedness between sub-tasks. As a base model, we use a standard recurrent neural network for both the classification and regression subtasks. Our system ranks 32nd out of 48 participants with a Pearson score of 0.557 in the first subtask, and 20th out of 35 in the fifth subtask with an accuracy score of 0.464.

pdf bib
Copenhagen at CoNLLSIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding
Yova Kementchedjhieva | Johannes Bjerva | Isabelle Augenstein
Proceedings of the CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

pdf bib
From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings
Johannes Bjerva | Isabelle Augenstein
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the World Atlas of Language Structure (WALS). Doing this manually is prohibitively time-consuming, which is in part evidenced by the fact that only 100 out of over 7,000 languages spoken in the world are fully covered in WALS. We learn distributed language representations, which can be used to predict typological properties on a massively multilingual scale. Additionally, quantitative and qualitative analyses of these language embeddings can tell us how language similarities are encoded in NLP models for tasks at different typological levels. The representations are learned in an unsupervised manner alongside tasks at three typological levels: phonology (grapheme-to-phoneme prediction, and phoneme reconstruction), morphology (morphological inflection), and syntax (part-of-speech tagging). We consider more than 800 languages and find significant differences in the language representations encoded, depending on the target task. For instance, although Norwegian Bokmål and Danish are typologically close to one another, they are phonologically distant, which is reflected in their language embeddings growing relatively distant in a phonological task. We are also able to predict typological features in WALS with high accuracies, even for unseen language families.

pdf bib
Parameter sharing between dependency parsers for related languages
Miryam de Lhoneux | Johannes Bjerva | Isabelle Augenstein | Anders Søgaard
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to better performance, but there is no consensus on what parameters to share. We present an evaluation of 27 different parameter sharing strategies across 10 languages, representing five pairs of related languages, each pair from a different language family. We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies. Based on this result, we propose an architecture where the transition classifier is shared, and the sharing of word and character parameters is controlled by a parameter that can be tuned on validation data. This model is linguistically motivated and obtains significant improvements over a monolingually trained baseline. We also find that sharing transition classifier parameters helps when training a parser on unrelated language pairs, but we find that, in the case of unrelated languages, sharing too many parameters does not help.

pdf bib
Tracking Typological Traits of Uralic Languages in Distributed Language Representations
Johannes Bjerva | Isabelle Augenstein
Proceedings of the Fourth International Workshop on Computational Linguistics of Uralic Languages

pdf bib
Cross-lingual complex word identification with multitask learning
Joachim Bingel | Johannes Bjerva
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We approach the 2018 Shared Task on Complex Word Identification by leveraging a cross-lingual multitask learning approach. Our method is highly language agnostic, as evidenced by the ability of our system to generalize across languages, including languages for which we have no training data. In the shared task, this is the case for French, for which our system achieves the best performance. We further provide a qualitative and quantitative analysis of which words pose problems for our system.

pdf bib
Character-level Supervision for Low-resource POS Tagging
Katharina Kann | Johannes Bjerva | Isabelle Augenstein | Barbara Plank | Anders Søgaard
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP

Neural part-of-speech (POS) taggers are known to not perform well with little training data. As a step towards overcoming this problem, we present an architecture for learning more robust neural POS taggers by jointly training a hierarchical, recurrent model and a recurrent character-based sequence-to-sequence network supervised using an auxiliary objective. This way, we introduce stronger character-level supervision into the model, which enables better generalization to unseen words and provides regularization, making our encoding less prone to overfitting. We experiment with three auxiliary tasks: lemmatization, character-based word autoencoding, and character-based random string autoencoding. Experiments with minimal amounts of labeled data on 34 languages show that our new architecture outperforms a single-task baseline and, surprisingly, that, on average, raw text autoencoding can be as beneficial for low-resource POS tagging as using lemma information. Our neural POS tagger closes the gap to a state-of-the-art POS tagger (MarMoT) for low-resource scenarios by 43%, even outperforming it on languages with templatic morphology, e.g., Arabic, Hebrew, and Turkish, by some margin.

2017

pdf bib
SU-RUG at the CoNLL-SIGMORPHON 2017 shared task: Morphological Inflection with Attentional Sequence-to-Sequence Models
Robert Östling | Johannes Bjerva
Proceedings of the CoNLL SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection

pdf bib
The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations
Lasha Abzianidze | Johannes Bjerva | Kilian Evang | Hessel Haagsma | Rik van Noord | Pierre Ludmann | Duc-Duy Nguyen | Johan Bos
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, assuming that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text in sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The employed annotation models are all language-neutral. Our first results are promising.

pdf bib
ResSim at SemEval-2017 Task 1: Multilingual Word Representations for Semantic Textual Similarity
Johannes Bjerva | Robert Östling
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Shared Task 1 at SemEval-2017 deals with assessing the semantic similarity between sentences, either in the same or in different languages. In our system submission, we employ multilingual word representations, in which similar words in different languages are close to one another. Using such representations is advantageous, since the increasing amount of available parallel data allows for the application of such methods to many of the languages in the world. Hence, semantic similarity can be inferred even for languages for which no annotated data exists. Our system is trained and evaluated on all language pairs included in the shared task (English, Spanish, Arabic, and Turkish). Although development results are promising, our system does not yield high performance on the shared task test sets.

pdf bib
Cross-lingual Learning of Semantic Textual Similarity with Multilingual Word Representations
Johannes Bjerva | Robert Östling
Proceedings of the 21st Nordic Conference on Computational Linguistics

pdf bib
Will my auxiliary tagging task help? Estimating Auxiliary Tasks Effectivity in Multi-Task Learning
Johannes Bjerva
Proceedings of the 21st Nordic Conference on Computational Linguistics

pdf bib
Neural Networks and Spelling Features for Native Language Identification
Johannes Bjerva | Gintarė Grigonytė | Robert Östling | Barbara Plank
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

We present the RUG-SU team’s submission at the Native Language Identification Shared Task 2017. We combine several approaches into an ensemble, based on spelling error features, a simple neural network using word representations, a deep residual network using word and character features, and a system based on a recurrent neural network. Our best system is an ensemble of neural networks, reaching an F1 score of 0.8323. Although our system is not the highest ranking one, we do outperform the baseline by far.

pdf bib
The Power of Character N-grams in Native Language Identification
Artur Kulmizev | Bo Blankers | Johannes Bjerva | Malvina Nissim | Gertjan van Noord | Barbara Plank | Martijn Wieling
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

In this paper, we explore the performance of a linear SVM trained on language independent character features for the NLI Shared Task 2017. Our basic system (GRONINGEN) achieves the best performance (87.56 F1-score) on the evaluation set using only 1-9 character n-grams as features. We compare this against several ensemble and meta-classifiers in order to examine how the linear system fares when combined with other, especially non-linear classifiers. Special emphasis is placed on the topic bias that exists by virtue of the assessment essay prompt distribution.

2016

pdf bib
Semantic Tagging with Deep Residual Networks
Johannes Bjerva | Barbara Plank | Johan Bos
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We propose a novel semantic tagging task, semtagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). Our tagger uses both word and character representations, and includes a novel residual bypass architecture. We evaluate the tagset both intrinsically on the new task of semantic tagging, as well as on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an auxiliary loss function predicting our semantic tags, significantly outperforms prior results on English Universal Dependencies POS tagging (95.71% accuracy on UD v1.2 and 95.67% accuracy on UD v1.3).

pdf bib
Detecting novel metaphor using selectional preference information
Hessel Haagsma | Johannes Bjerva
Proceedings of the Fourth Workshop on Metaphor in NLP

pdf bib
Morphological Complexity Influences Verb-Object Order in Swedish Sign Language
Johannes Bjerva | Carl Börstell
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

Computational linguistic approaches to sign languages could benefit from investigating how complexity influences structure. We investigate whether morphological complexity has an effect on the order of Verb (V) and Object (O) in Swedish Sign Language (SSL), on the basis of elicited data from five Deaf signers. We find a significant difference in the distribution of the orderings OV vs. VO, based on an analysis of morphological weight. While morphologically heavy verbs exhibit a general preference for OV, humanness seems to affect the ordering in the opposite direction, with [+human] Objects pushing towards a preference for VO.

pdf bib
Byte-based Language Identification with Deep Convolutional Networks
Johannes Bjerva
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)

We report on our system for the shared task on discriminating between similar languages (DSL 2016). The system uses only byte representations in a deep residual network (ResNet). The system, named ResIdent, is trained only on the data released with the task (closed training). We obtain 84.88% accuracy on subtask A, 68.80% accuracy on subtask B1, and 69.80% accuracy on subtask B2. A large difference in accuracy on development data can be observed with relatively minor changes in our network’s architecture and hyperparameters. We therefore expect fine-tuning of these parameters to yield higher accuracies.

pdf bib
The Meaning Factory at SemEval-2016 Task 8: Producing AMRs with Boxer
Johannes Bjerva | Johan Bos | Hessel Haagsma
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
Word Embeddings Pointing the Way for Late Antiquity
Johannes Bjerva | Raf Praet
Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)

2014

pdf bib
The Meaning Factory: Formal Semantics for Recognizing Textual Entailment and Determining Semantic Similarity
Johannes Bjerva | Johan Bos | Rob van der Goot | Malvina Nissim
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib
Multi-class Animacy Classification with Semantic Features
Johannes Bjerva
Proceedings of the Student Research Workshop at the 14th Conference of the European Chapter of the Association for Computational Linguistics