Nick Howell


2020

An Unsupervised Method for Weighting Finite-state Morphological Analyzers
Amr Keleg | Francis Tyers | Nick Howell | Tommi Pirinen
Proceedings of the 12th Language Resources and Evaluation Conference

Morphological analysis is one of the tasks that have been studied for years. Different techniques have been used to develop models for performing morphological analysis. Models based on finite state transducers have proved to be more suitable for languages with few available resources. In this paper, we develop a method for weighting a morphological analyzer built using finite state transducers in order to disambiguate its results. The method is based on a word2vec model that is trained in a completely unsupervised way using raw untagged corpora and is able to capture the semantic meaning of the words. Most of the methods used for disambiguating the results of a morphological analyzer rely on tagged corpora that need to be built manually. Additionally, the method developed uses information about the token irrespective of its context, unlike most other techniques, which rely heavily on the word’s context to disambiguate its set of candidate analyses.

Effort-value payoff in lemmatisation for Uralic languages
Nick Howell | Maria Bibaeva | Francis M. Tyers
Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages

Language Models for Cloze Task Answer Generation in Russian
Anastasia Nikiforova | Sergey Pletenev | Daria Sinitsyna | Semen Sorokin | Anastasia Lopukhina | Nick Howell
Proceedings of the Second Workshop on Linguistic and Neurocognitive Resources

Linguistic predictability is the degree of confidence with which a language unit (word, part of speech, etc.) can be expected to come next in a sequence. Experiments have shown that a correct prediction simplifies the perception of a language unit and its integration into the context, while an incorrect prediction slows language processing down. Currently, to get a measure of a language unit's predictability, a neurolinguistic experiment known as a cloze task has to be conducted on a large number of participants. Cloze tasks are resource-consuming and are criticized by some researchers as an insufficiently valid measure of predictability. In this paper, we compare different language models that attempt to simulate human respondents’ performance on the cloze task. Using a language model to create cloze task simulations would require significantly less time and would facilitate studies related to linguistic predictability.

2019

A biscriptual morphological transducer for Crimean Tatar
Francis M. Tyers | Jonathan Washington | Darya Kavitskaya | Memduh Gökırmak | Nick Howell | Remziye Berberova
Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)