Khalid Alnajjar


2020

On Editing Dictionaries for Uralic Languages in an Online Environment
Khalid Alnajjar | Mika Hämäläinen | Jack Rueter
Proceedings of the Sixth International Workshop on Computational Linguistics of Uralic Languages

Ve’rdd. Narrowing the Gap between Paper Dictionaries, Low-Resource NLP and Community Involvement
Khalid Alnajjar | Mika Hämäläinen | Jack Rueter | Niko Partanen
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

We present an open-source online dictionary editing system, Ve′rdd, that offers a chance to re-evaluate and edit grassroots dictionaries that have been exposed to multiple amateur editors. The idea is to incorporate community activities into a state-of-the-art finite-state language description of a seriously endangered minority language, Skolt Sami. Problems involve getting the community to take part in things above the pencil-and-paper level. At times, it seems that native speakers and dictionary-oriented contributors lack the technical understanding needed to utilize the infrastructures that might make their work more meaningful in the future, i.e. multiple reuse of all of their input. Therefore, our system integrates with the existing tools and infrastructures for Uralic languages, masking the technical complexities behind a user-friendly UI.

2019

Generating Modern Poetry Automatically in Finnish
Mika Hämäläinen | Khalid Alnajjar
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We present a novel approach for generating poetry automatically for the morphologically rich Finnish language by using a genetic algorithm. The approach improves on the state of the art set by previous Finnish poem generators by introducing a higher degree of freedom in terms of structural creativity. Our approach is evaluated and described within the paradigm of computational creativity, where the fitness functions of the genetic algorithm are assimilated with the notion of aesthetics. The output is considered to be a poem 81.5% of the time by human evaluators.

Dialect Text Normalization to Normative Standard Finnish
Niko Partanen | Mika Hämäläinen | Khalid Alnajjar
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

We compare different LSTMs and transformer models in terms of their effectiveness in normalizing dialectal Finnish into the normative standard Finnish. As dialect is the common way of communication for people online in Finnish, such a normalization is a necessary step to improve the accuracy of the existing Finnish NLP tools that are tailored for normative Finnish text. We work on a corpus consisting of dialectal data of 23 distinct Finnish dialects. The best functioning BRNN approach lowers the initial word error rate of the corpus from 52.89 to 5.73.

Let’s FACE it. Finnish Poetry Generation with Aesthetics and Framing
Mika Hämäläinen | Khalid Alnajjar
Proceedings of the 12th International Conference on Natural Language Generation

We present a creative poem generator for the morphologically rich Finnish language. Our method falls into the master-apprentice paradigm, where a computationally creative genetic algorithm teaches a BRNN model to generate poetry. We model several parts of poetic aesthetics in the fitness function of the genetic algorithm, such as sonic features, semantic coherence, imagery and metaphor. Furthermore, we justify the creativity of our method based on the FACE theory on computational creativity and take additional care in evaluating our system by automatic metrics for concepts together with human evaluation for aesthetics, framing and expressions.

2018

A Master-Apprentice Approach to Automatic Creation of Culturally Satirical Movie Titles
Khalid Alnajjar | Mika Hämäläinen
Proceedings of the 11th International Conference on Natural Language Generation

Satire has played a role in indirectly expressing critique towards an authority or a person from time immemorial. We present an autonomously creative master-apprentice approach consisting of a genetic algorithm and an NMT model to produce humorous and culturally apt satire out of movie titles automatically. Furthermore, we evaluate the approach in terms of its creativity and its output. We provide a solid definition for creativity to maximize the objectiveness of the evaluation.