Bradley Hauer


2020

UAlberta at SemEval-2020 Task 2: Using Translations to Predict Cross-Lingual Entailment
Bradley Hauer | Amir Ahmad Habibi | Yixing Luan | Arnob Mallik | Grzegorz Kondrak
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We investigate the hypothesis that translations can be used to identify cross-lingual lexical entailment. We propose novel methods that leverage parallel corpora, word embeddings, and multilingual lexical resources. Our results demonstrate that the implementation of these ideas leads to improvements in predicting entailment.

Improving Word Sense Disambiguation with Translations
Yixing Luan | Bradley Hauer | Lili Mou | Grzegorz Kondrak
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

It has been conjectured that multilingual information can help monolingual word sense disambiguation (WSD). However, existing WSD systems rarely consider multilingual information, and no effective method has been proposed for improving WSD by generating translations. In this paper, we present a novel approach that improves the performance of a base WSD system using machine translation. Since our approach is language independent, we perform WSD experiments on several languages. The results demonstrate that our methods can consistently improve the performance of WSD systems, and obtain state-of-the-art results in both English and multilingual WSD. To facilitate the use of lexical translation information, we also propose BABALIGN, a precise bitext alignment algorithm which is guided by multilingual lexical correspondences from BabelNet.

Low-Resource G2P and P2G Conversion with Synthetic Training Data
Bradley Hauer | Amir Ahmad Habibi | Yixing Luan | Arnob Mallik | Grzegorz Kondrak
Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents the University of Alberta systems and results in the SIGMORPHON 2020 Task 1: Multilingual Grapheme-to-Phoneme Conversion. Following previous SIGMORPHON shared tasks, we define a low-resource setting with 100 training instances. We experiment with three transduction approaches in both standard and low-resource settings, as well as on the related task of phoneme-to-grapheme conversion. We propose a method for synthesizing training data using a combination of diverse models.

2019

Cognate Projection for Low-Resource Inflection Generation
Bradley Hauer | Amir Ahmad Habibi | Yixing Luan | Rashed Rubby Riyadh | Grzegorz Kondrak
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

We propose cognate projection as a method of crosslingual transfer for inflection generation in the context of the SIGMORPHON 2019 Shared Task. The results on four language pairs show the method is effective when no low-resource training data is available.

2018

Combining Neural and Non-Neural Methods for Low-Resource Morphological Reinflection
Saeed Najafi | Bradley Hauer | Rashed Rubby Riyadh | Leyuan Yu | Grzegorz Kondrak
Proceedings of the CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection

Comparison of Assorted Models for Transliteration
Saeed Najafi | Bradley Hauer | Rashed Rubby Riyadh | Leyuan Yu | Grzegorz Kondrak
Proceedings of the Seventh Named Entities Workshop

We report the results of our experiments in the context of the NEWS 2018 Shared Task on Transliteration. We focus on the comparison of several diverse systems, including three neural MT models. A combination of discriminative, generative, and neural models obtains the best results on the development sets. We also put forward ideas for improving the shared task.

2017

If you can’t beat them, join them: the University of Alberta system description
Garrett Nicolai | Bradley Hauer | Mohammad Motallebi | Saeed Najafi | Grzegorz Kondrak
Proceedings of the CoNLL SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection

Bootstrapping Unsupervised Bilingual Lexicon Induction
Bradley Hauer | Garrett Nicolai | Grzegorz Kondrak
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

The task of unsupervised lexicon induction is to find translation pairs across monolingual corpora. We develop a novel method that creates seed lexicons by identifying cognates in the vocabularies of related languages on the basis of their frequency and lexical similarity. We apply bidirectional bootstrapping to a method which learns a linear mapping between context-based vector spaces. Experimental results on three language pairs show consistent improvement over prior work.

2016

Morphological Reinflection via Discriminative String Transduction
Garrett Nicolai | Bradley Hauer | Adam St Arnaud | Grzegorz Kondrak
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

Decoding Anagrammed Texts Written in an Unknown Language and Script
Bradley Hauer | Grzegorz Kondrak
Transactions of the Association for Computational Linguistics, Volume 4

Algorithmic decipherment is a prime example of a truly unsupervised problem. The first step in the decipherment process is the identification of the encrypted language. We propose three methods for determining the source language of a document enciphered with a monoalphabetic substitution cipher. The best method achieves 97% accuracy on 380 languages. We then present an approach to decoding anagrammed substitution ciphers, in which the letters within words have been arbitrarily transposed. It obtains an average decryption word accuracy of 93% on a set of 50 ciphertexts in 5 languages. Finally, we report the results on the Voynich manuscript, an unsolved fifteenth-century cipher, which suggest Hebrew as the language of the document.

2015

Multiple System Combination for Transliteration
Garrett Nicolai | Bradley Hauer | Mohammad Salameh | Adam St Arnaud | Ying Xu | Lei Yao | Grzegorz Kondrak
Proceedings of the Fifth Named Entity Workshop

2014

Solving Substitution Ciphers with Combined Language Models
Bradley Hauer | Ryan Hayward | Grzegorz Kondrak
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Cognate and Misspelling Features for Natural Language Identification
Garrett Nicolai | Bradley Hauer | Mohammad Salameh | Lei Yao | Grzegorz Kondrak
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

Automatic Generation of English Respellings
Bradley Hauer | Grzegorz Kondrak
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2011

Leveraging Transliterations from Multiple Languages
Aditya Bhargava | Bradley Hauer | Grzegorz Kondrak
Proceedings of the 3rd Named Entities Workshop (NEWS 2011)

Clustering Semantically Equivalent Words into Cognate Sets in Multilingual Lists
Bradley Hauer | Grzegorz Kondrak
Proceedings of 5th International Joint Conference on Natural Language Processing