Marie Francine Moens

Also published as: Marie-Francine Moens


2020

Proceedings of the Third International Workshop on Spatial Language Understanding
Parisa Kordjamshidi | Archna Bhatia | Malihe Alikhani | Jason Baldridge | Mohit Bansal | Marie-Francine Moens
Proceedings of the Third International Workshop on Spatial Language Understanding

LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification
Erfan Ghadery | Marie-Francine Moens
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper presents our system entitled ‘LIIR’ for SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2). We participated in sub-task A for English, Danish, Greek, Arabic, and Turkish. We adapt and fine-tune the BERT and Multilingual BERT models made available by Google AI for English and non-English languages respectively. For English, we use a combination of two fine-tuned BERT models. For the other languages, we propose a cross-lingual augmentation approach to enrich the training data, and we use Multilingual BERT to obtain sentence representations.

Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Aditya Mogadala | Sandro Pezzelle | Dietrich Klakow | Marie-Francine Moens | Zeynep Akata
Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Decoding Language Spatial Relations to 2D Spatial Arrangements
Gorjan Radevski | Guillem Collell | Marie-Francine Moens | Tinne Tuytelaars
Findings of the Association for Computational Linguistics: EMNLP 2020

We address the problem of multimodal spatial understanding by decoding a set of language-expressed spatial relations to a set of 2D spatial arrangements in a multi-object and multi-relationship setting. We frame the task as arranging a scene of clip-arts given a textual description. We propose a simple and effective model architecture Spatial-Reasoning Bert (SR-Bert), trained to decode text to 2D spatial arrangements in a non-autoregressive manner. SR-Bert can decode both explicit and implicit language to 2D spatial arrangements, generalizes to out-of-sample data to a reasonable extent and can generate complete abstract scenes if paired with a clip-arts predictor. Finally, we qualitatively evaluate our method with a user study, validating that our generated spatial arrangements align with human expectation.

Autoregressive Reasoning over Chains of Facts with Transformers
Ruben Cartuyvels | Graham Spinks | Marie-Francine Moens
Proceedings of the 28th International Conference on Computational Linguistics

This paper proposes an iterative inference algorithm for multi-hop explanation regeneration that retrieves relevant factual evidence in the form of text snippets, given a natural language question and its answer. Combining multiple sources of evidence or facts for multi-hop reasoning becomes increasingly hard when the number of sources needed to make an inference grows. Our algorithm copes with this by decomposing the selection of facts from a corpus autoregressively, conditioning the next iteration on previously selected facts. This allows us to use a pairwise learning-to-rank loss. We validate our method on datasets of the TextGraphs 2019 and 2020 Shared Tasks for explanation regeneration. Existing work on this task either evaluates facts in isolation or artificially limits the possible chains of facts, thus limiting multi-hop inference. We demonstrate that our algorithm, when used with a pre-trained transformer model, outperforms the previous state-of-the-art in terms of precision, training time and inference efficiency.

Representation, Learning and Reasoning on Spatial Language for Downstream NLP Tasks
Parisa Kordjamshidi | James Pustejovsky | Marie-Francine Moens
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Understanding spatial semantics expressed in natural language can become highly complex in real-world applications. This includes applications of language grounding, navigation, visual question answering, and more generic human-machine interaction and dialogue systems. In many such downstream tasks, explicit representation of spatial concepts and relationships can improve the capabilities of machine learning models in reasoning and deep language understanding. In this tutorial, we overview the cutting-edge research results and existing challenges related to spatial language understanding, including semantic annotations, existing corpora, symbolic and sub-symbolic representations, qualitative spatial reasoning, spatial common sense, and deep and structured learning models. We discuss the recent results on the above-mentioned applications that need spatial language learning and reasoning, and highlight the research gaps and future directions.

ECHR: Legal Corpus for Argument Mining
Prakash Poudyal | Jaromir Savelka | Aagje Ieven | Marie Francine Moens | Teresa Goncalves | Paulo Quaresma
Proceedings of the 7th Workshop on Argument Mining

In this paper, we publicly release an annotated corpus of 42 decisions of the European Court of Human Rights (ECHR). The corpus is annotated in terms of three types of clauses useful in argument mining: premise, conclusion, and non-argument parts of the text. Furthermore, relationships among the premises and conclusions are mapped. We present baselines for three tasks that lead from unstructured texts to structured arguments. The tasks are argument clause recognition, clause relation prediction, and premise/conclusion recognition. Despite a straightforward application of Bidirectional Encoder Representations from Transformers (BERT), we obtained very promising results (F1 0.765 on argument recognition, 0.511 on relation prediction, and 0.859/0.628 on premise/conclusion recognition). The results suggest the usefulness of pre-trained language models based on deep neural network architectures in argument mining. Because of the simplicity of the baselines, there is ample space for improvement in future work based on the released corpus.

2019

Talk2Car: Taking Control of Your Self-Driving Car
Thierry Deruyttere | Simon Vandenhende | Dusan Grujicic | Luc Van Gool | Marie-Francine Moens
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

A long-term goal of artificial intelligence is to have an agent execute commands communicated through natural language. In many cases the commands are grounded in a visual environment shared by the human who gives the command and the agent. Execution of the command then requires mapping the command into the physical visual space, after which the appropriate action can be taken. In this paper we consider the former. Or more specifically, we consider the problem in an autonomous driving setting, where a passenger requests an action that can be associated with an object found in a street scene. Our work presents the Talk2Car dataset, which is the first object referral dataset that contains commands written in natural language for self-driving cars. We provide a detailed comparison with related datasets such as ReferIt, RefCOCO, RefCOCO+, RefCOCOg, Cityscape-Ref and CLEVR-Ref. Additionally, we include a performance analysis using strong state-of-the-art models. The results show that the proposed object referral task is a challenging one for which the models show promising results but still require additional research in natural language processing, computer vision and the intersection of these fields. The dataset can be found on our website: http://macchina-ai.eu/

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Aditya Mogadala | Dietrich Klakow | Sandro Pezzelle | Marie-Francine Moens
Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Learning Unsupervised Multilingual Word Embeddings with Incremental Multilingual Hubs
Geert Heyman | Bregt Verreet | Ivan Vulić | Marie-Francine Moens
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Recent research has discovered that a shared bilingual word embedding space can be induced by projecting monolingual word embedding spaces from two languages using a self-learning paradigm without any bilingual supervision. However, it has also been shown that for distant language pairs such fully unsupervised self-learning methods are unstable and often get stuck in poor local optima due to reduced isomorphism between starting monolingual spaces. In this work, we propose a new robust framework for learning unsupervised multilingual word embeddings that mitigates the instability issues. We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space. Through the gradual language addition the method can leverage the interdependencies between the new language and all other languages in the current multilingual space. We find that it is beneficial to project more distant languages later in the iterative process. Our fully unsupervised multilingual embedding spaces yield results that are on par with the state-of-the-art methods in the bilingual lexicon induction (BLI) task, and simultaneously obtain state-of-the-art scores on two downstream tasks: multilingual document classification and multilingual dependency parsing, outperforming even supervised baselines. This finding also accentuates the need to establish evaluation protocols for cross-lingual word embeddings beyond the omnipresent intrinsic BLI task in future work.

2018

Generating Continuous Representations of Medical Texts
Graham Spinks | Marie-Francine Moens
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

We present an architecture that generates medical texts while learning an informative, continuous representation with discriminative features. During training the input to the system is a dataset of captions for medical X-Rays. The acquired continuous representations are of particular interest for use in many machine learning techniques where the discrete and high-dimensional nature of textual input is an obstacle. We use an Adversarially Regularized Autoencoder to create realistic text in both an unconditional and conditional setting. We show that this technique is applicable to medical texts which often contain syntactic and domain-specific shorthands. A quantitative evaluation shows that we achieve a lower model perplexity than a traditional LSTM generator.

Temporal Information Extraction by Predicting Relative Time-lines
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The current leading paradigm for temporal information extraction from text consists of three phases: (1) recognition of events and temporal expressions, (2) recognition of temporal relations among them, and (3) time-line construction from the temporal relations. In contrast to the first two phases, the last phase, time-line construction, received little attention and is the focus of this work. In this paper, we propose a new method to construct a linear time-line from a set of (extracted) temporal relations. But more importantly, we propose a novel paradigm in which we directly predict start and end-points for events from the text, constituting a time-line without going through the intermediate step of prediction of temporal relations as in earlier work. Within this paradigm, we propose two models that predict in linear complexity, and a new training loss using TimeML-style annotations, yielding promising results.

Word-Level Loss Extensions for Neural Temporal Relation Classification
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 27th International Conference on Computational Linguistics

Unsupervised pre-trained word embeddings are used effectively for many tasks in natural language processing to leverage unlabeled textual data. Often these embeddings are either used as initializations or as fixed word representations for task-specific classification models. In this work, we extend our classification model’s task loss with an unsupervised auxiliary loss on the word-embedding level of the model. This is to ensure that the learned word representations contain both task-specific features, learned from the supervised loss component, and more general features learned from the unsupervised loss component. We evaluate our approach on the task of temporal relation extraction, in particular, narrative containment relation extraction from clinical records, and show that continued training of the embeddings on the unsupervised objective together with the task objective gives better task-specific embeddings, and results in an improvement over the state of the art on the THYME dataset, using only a general-domain part-of-speech tagger as linguistic resource.

A Flexible and Easy-to-use Semantic Role Labeling Framework for Different Languages
Quynh Ngoc Thi Do | Artuur Leeuwenberg | Geert Heyman | Marie-Francine Moens
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

This paper presents a flexible and open source framework for deep semantic role labeling. We aim at facilitating easy exploration of model structures for multiple languages with different characteristics. It provides flexibility in its model construction in terms of word representation, sequence representation, output modeling, and inference styles and comes with clear output visualization. The framework is available under the Apache 2.0 license.

Learning Representations Specialized in Spatial Knowledge: Leveraging Language and Vision
Guillem Collell | Marie-Francine Moens
Transactions of the Association for Computational Linguistics, Volume 6

Spatial understanding is crucial in many real-world problems, yet little progress has been made towards building representations that capture spatial knowledge. Here, we move one step forward in this direction and learn such representations by leveraging a task consisting in predicting continuous 2D spatial arrangements of objects given object-relationship-object instances (e.g., “cat under chair”) and a simple neural network model that learns the task from annotated images. We show that the model succeeds in this task and, furthermore, that it is capable of predicting correct spatial arrangements for unseen objects if either CNN features or word embeddings of the objects are provided. The differences between visual and linguistic features are discussed. Next, to evaluate the spatial representations learned in the previous task, we introduce a task and a dataset consisting in a set of crowdsourced human ratings of spatial similarity for object pairs. We find that both CNN (convolutional neural network) features and word embeddings predict human judgments of similarity well and that these vectors can be further specialized in spatial knowledge if we update them when training the model that predicts spatial arrangements of objects. Overall, this paper paves the way towards building distributed spatial representations, contributing to the understanding of spatial expressions in language.

Do Neural Network Cross-Modal Mappings Really Bridge Modalities?
Guillem Collell | Marie-Francine Moens
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Feed-forward networks are widely used in cross-modal applications to bridge modalities by mapping distributed vectors of one modality to the other, or to a shared space. The predicted vectors are then used to perform e.g., retrieval or labeling. Thus, the success of the whole system relies on the ability of the mapping to make the neighborhood structure (i.e., the pairwise similarities) of the predicted vectors akin to that of the target vectors. However, whether this is achieved has not been investigated yet. Here, we propose a new similarity measure and two ad hoc experiments to shed light on this issue. In three cross-modal benchmarks we learn a large number of language-to-vision and vision-to-language neural network mappings (up to five layers) using a rich diversity of image and text features and loss functions. Our results reveal that, surprisingly, the neighborhood structure of the predicted vectors consistently resembles more that of the input vectors than that of the target vectors. In a second experiment, we further show that untrained nets do not significantly disrupt the neighborhood (i.e., semantic) structure of the input vectors.

Proceedings of the First International Workshop on Spatial Language Understanding
Parisa Kordjamshidi | Archna Bhatia | James Pustejovsky | Marie-Francine Moens
Proceedings of the First International Workshop on Spatial Language Understanding

Evaluating Textual Representations through Image Generation
Graham Spinks | Marie-Francine Moens
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

We present a methodology for determining the quality of textual representations through the ability to generate images from them. Continuous representations of textual input are ubiquitous in modern Natural Language Processing techniques either at the core of machine learning algorithms or as the by-product at any given layer of a neural network. While current techniques to evaluate such representations focus on their performance on particular tasks, they don’t provide a clear understanding of the level of informational detail that is stored within them, especially their ability to represent spatial information. The central premise of this paper is that visual inspection or analysis is the most convenient method to quickly and accurately determine information content. Through the use of text-to-image neural networks, we propose a new technique to compare the quality of textual representations by visualizing their information content. The method is illustrated on a medical dataset where the correct representation of spatial information and shorthands are of particular importance. For four different well-known textual representations, we show with a quantitative analysis that some representations are consistently able to deliver higher quality visualizations of the information content. Additionally, we show that the quantitative analysis technique correlates with the judgment of a human expert evaluator in terms of alignment.

2017

Bilingual Lexicon Induction by Learning to Combine Word-Level and Character-Level Representations
Geert Heyman | Ivan Vulić | Marie-Francine Moens
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology. We frame BLI as a classification problem for which we design a neural network based classification architecture composed of recurrent long short-term memory and deep feed forward networks. The results show that word- and character-level representations each improve state-of-the-art results for BLI, and the best results are obtained by exploiting the synergy between these word- and character-level representations in the classification model.

Structured Learning for Temporal Relation Extraction from Clinical Records
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

We propose a scalable structured learning model that jointly predicts temporal relations between events and temporal expressions (TLINKS), and the relation between these events and the document creation time (DCTR). We employ a structured perceptron, together with integer linear programming constraints for document-level inference during training and prediction to exploit relational properties of temporality, together with global learning of the relations at the document level. Moreover, this study gives insights into the results of integrating constraints for temporal relation extraction when using structured learning and prediction. Our best system outperforms the state-of-the-art on both the CONTAINS TLINK task and the DCTR task.

Improving Implicit Semantic Role Labeling by Predicting Semantic Frame Arguments
Quynh Ngoc Thi Do | Steven Bethard | Marie-Francine Moens
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Implicit semantic role labeling (iSRL) is the task of predicting the semantic roles of a predicate that do not appear as explicit arguments, but rather regard common sense knowledge or are mentioned earlier in the discourse. We introduce an approach to iSRL based on a predictive recurrent neural semantic frame model (PRNSFM) that uses a large unannotated corpus to learn the probability of a sequence of semantic arguments given a predicate. We leverage the sequence probabilities predicted by the PRNSFM to estimate selectional preferences for predicates and their arguments. On the NomBank iSRL test set, our approach improves state-of-the-art performance on implicit semantic role labeling with less reliance than prior work on manually constructed language resources.

KULeuven-LIIR at SemEval-2017 Task 12: Cross-Domain Temporal Information Extraction from Clinical Records
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, we describe the system of the KULeuven-LIIR submission for Clinical TempEval 2017. We participated in all six subtasks, using a combination of Support Vector Machines (SVM) for event and temporal expression detection, and a structured perceptron for extracting temporal relations. Moreover, we present and analyze the results from our submissions, and verify the effectiveness of several system components. Our system performed above average for all subtasks in both phases.

Learning to Recognize Animals by Watching Documentaries: Using Subtitles as Weak Supervision
Aparna Nurani Venkitasubramanian | Tinne Tuytelaars | Marie-Francine Moens
Proceedings of the Sixth Workshop on Vision and Language

We investigate animal recognition models learned from wildlife video documentaries by using the weak supervision of the textual subtitles. This is a particularly challenging setting, since i) the animals occur in their natural habitat and are often largely occluded and ii) subtitles are to a large degree complementary to the visual content, providing a very weak supervisory signal. This is in contrast to most work on integrated vision and language in the literature, where textual descriptions are tightly linked to the image content, and often generated in a curated fashion for the task at hand. In particular, we investigate different image representations and models, including a support vector machine on top of activations of a pretrained convolutional neural network, as well as a Naive Bayes framework on a ‘bag-of-activations’ image representation, where each element of the bag is considered separately. This representation allows key components in the image to be isolated, in spite of largely varying backgrounds and image clutter, without an object detection or image segmentation step. The methods are evaluated based on how well they transfer to unseen camera-trap images captured across diverse topographical regions under different environmental conditions and illumination settings, involving a large domain shift.

2016

Semi-automatically Alignment of Predicates between Speech and OntoNotes data
Niraj Shrestha | Marie-Francine Moens
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Speech data currently receives growing attention and is an important source of information. We still lack suitable corpora of transcribed speech annotated with semantic roles that can be used for semantic role labeling (SRL), which is not the case for written data. Semantic role labeling in speech data is a challenging and complex task due to the lack of sentence boundaries and the many transcription errors such as insertions, deletions and misspellings of words. In written data, SRL evaluation is performed at the sentence level, but in speech data sentence boundary identification is still a bottleneck, which makes evaluation more complex. In this work, we semi-automatically align the predicates found in transcribed speech obtained with an automatic speech recognizer (ASR) with the predicates found in the corresponding written documents of the OntoNotes corpus and manually align the semantic roles of these predicates, thus obtaining annotated semantic frames in the speech data. This data can serve as gold standard alignments for future research in semantic role labeling of speech data.

Facing the most difficult case of Semantic Role Labeling: A collaboration of word embeddings and co-training
Quynh Ngoc Thi Do | Steven Bethard | Marie-Francine Moens
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We present a successful collaboration of word embeddings and co-training to tackle the most difficult test case of semantic role labeling: predicting out-of-domain and unseen semantic frames. Despite the fact that co-training is a successful traditional semi-supervised method, its application in SRL is very limited, especially when a huge amount of labeled data is available. In this work, co-training is used together with word embeddings to improve the performance of a system trained on a large training dataset. We also introduce a semantic role labeling system with a simple learning architecture and effective inference that is easily adaptable to semi-supervised settings with new training data and/or new features. On the out-of-domain testing set of the standard benchmark CoNLL 2009 data, our simple approach achieves high performance and improves state-of-the-art results.

Is an Image Worth More than a Thousand Words? On the Fine-Grain Semantic Differences between Visual and Linguistic Representations
Guillem Collell | Marie-Francine Moens
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Human concept representations are often grounded with visual information, yet some aspects of meaning cannot be visually represented or are better described with language. Thus, vision and language provide complementary information that, properly combined, can potentially yield more complete concept representations. Recently, state-of-the-art distributional semantic models and convolutional neural networks have achieved great success in representing linguistic and visual knowledge respectively. In this paper, we compare visual and linguistic representations in their ability to capture different types of fine-grain semantic knowledge—or attributes—of concepts. Humans often describe objects using attributes, that is, properties such as shape, color or functionality, which often transcend the linguistic and visual modalities. In our setting, we evaluate how well attributes can be predicted by using the unimodal representations as inputs. We are interested in, first, finding out whether attributes are generally better captured by the vision or by the language modality; and second, if neither of them is clearly superior (as we hypothesize), what types of attributes or semantic knowledge are better encoded by each modality. Ultimately, our study sheds light on the potential of combining visual and textual representations.

Multi-Modal Representations for Improved Bilingual Lexicon Learning
Ivan Vulić | Douwe Kiela | Stephen Clark | Marie-Francine Moens
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Dataset for Multimodal Question Answering in the Cultural Heritage Domain
Shurong Sheng | Luc Van Gool | Marie-Francine Moens
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

Multimodal question answering in the cultural heritage domain allows visitors to ask questions in a more natural way and thus provides better user experiences with cultural objects while visiting a museum, landmark or any other historical site. In this paper, we introduce the construction of a gold standard dataset that will aid research on multimodal question answering in the cultural heritage domain. The dataset, which will soon be released to the public, contains multimodal content including images of typical artworks from the fascinating old-Egyptian Amarna period, related image-containing documents of the artworks and over 800 multimodal queries integrating visual and textual questions. The multimodal questions and related documents are all in English. The multimodal questions are linked to relevant paragraphs in the related documents that contain the answer to the multimodal query.

Visualizing the Content of a Children’s Story in a Virtual World: Lessons Learned
Quynh Ngoc Thi Do | Steven Bethard | Marie-Francine Moens
Proceedings of the Workshop on Uphill Battles in Language Processing: Scaling Early Achievements to Robust Methods

KULeuven-LIIR at SemEval 2016 Task 12: Detecting Narrative Containment in Clinical Records
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Adapting Coreference Resolution for Narrative Processing
Quynh Ngoc Thi Do | Steven Bethard | Marie-Francine Moens
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Information Extraction from Biomedical Texts: Learning Models with Limited Supervision
Marie-Francine Moens
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis

Proceedings of the Fourth Workshop on Vision and Language
Anja Belz | Luisa Coheur | Vittorio Ferrari | Marie-Francine Moens | Katerina Pastra | Ivan Vulić
Proceedings of the Fourth Workshop on Vision and Language

Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction
Ivan Vulić | Marie-Francine Moens
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

TKLBLIIR: Detecting Twitter Paraphrases with TweetingJay
Mladen Karan | Goran Glavaš | Jan Šnajder | Bojana Dalbelo Bašić | Ivan Vulić | Marie-Francine Moens
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

SemEval-2015 Task 8: SpaceEval
James Pustejovsky | Parisa Kordjamshidi | Marie-Francine Moens | Aaron Levine | Seth Dworman | Zachary Yocum
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)
Oleksandr Kolomiyets | Marie-Francine Moens | Martha Palmer | James Pustejovsky | Steven Bethard
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

pdf bib
Key Event Detection in Video using ASR and Visual Data
Niraj Shrestha | Aparna N. Venkitasubramanian | Marie-Francine Moens
Proceedings of the Third Workshop on Vision and Language

pdf bib
Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data
Ivan Vulić | Marie-Francine Moens
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf bib
HiEve: A Corpus for Extracting Event Hierarchies from News Stories
Goran Glavaš | Jan Šnajder | Marie-Francine Moens | Parisa Kordjamshidi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In news stories, event mentions denote real-world events of different spatial and temporal granularity. Narratives in news stories typically describe some real-world event of coarse spatial and temporal granularity along with its subevents. In this work, we present HiEve, a corpus for recognizing relations of spatiotemporal containment between events. In HiEve, the narratives are represented as hierarchies of events based on relations of spatiotemporal containment (i.e., superevent-subevent relations). We describe the process of manual annotation of HiEve. Furthermore, we build a supervised classifier for recognizing spatiotemporal containment between events to serve as a baseline for future research. Preliminary experimental results are encouraging, with classifier performance reaching 58% F1-score, only 11% less than the inter-annotator agreement.

2013

pdf bib
A Study on Bootstrapping Bilingual Vector Spaces from Non-Parallel Data (and Nothing Else)
Ivan Vulić | Marie-Francine Moens
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Detecting Relations in the Gene Regulation Network
Thomas Provoost | Marie-Francine Moens
Proceedings of the BioNLP Shared Task 2013 Workshop

pdf bib
Cross-Lingual Semantic Similarity of Words as the Similarity of Their Semantic Word Responses
Ivan Vulić | Marie-Francine Moens
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
KUL: Data-driven Approach to Temporal Parsing of Newswire Articles
Oleksandr Kolomiyets | Marie-Francine Moens
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

pdf bib
SemEval-2013 Task 3: Spatial Role Labeling
Oleksandr Kolomiyets | Parisa Kordjamshidi | Marie-Francine Moens | Steven Bethard
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

pdf bib
SemEval-2012 Task 3: Spatial Role Labeling
Parisa Kordjamshidi | Steven Bethard | Marie-Francine Moens
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf bib
Extracting Narrative Timelines as Temporal Dependency Structures
Oleksandr Kolomiyets | Steven Bethard | Marie-Francine Moens
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Annotating Story Timelines as Temporal Dependency Structures
Steven Bethard | Oleksandr Kolomiyets | Marie-Francine Moens
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present an approach to annotating timelines in stories where events are linked together by temporal relations into a temporal dependency tree. This approach avoids the disconnected timeline problems of prior work, and results in timelines that are more suitable for temporal reasoning. We show that annotating timelines as temporal dependency trees is possible with high levels of inter-annotator agreement (Krippendorff's Alpha of 0.822 on selecting event pairs, and of 0.700 on selecting temporal relation labels), even with the moderately sized relation set of BEFORE, AFTER, INCLUDES, IS-INCLUDED, IDENTITY and OVERLAP. We also compare several annotation schemes for identifying story events, and show that higher inter-annotator agreement can be reached by focusing on only the events that are essential to forming the timeline, skipping words in negated contexts, modal contexts and quoted speech.

pdf bib
KU Leuven at HOO-2012: A Hybrid Approach to Detection and Correction of Determiner and Preposition Errors in Non-native English Text
Li Quan | Oleksandr Kolomiyets | Marie-Francine Moens
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

pdf bib
Skip N-grams and Ranking Functions for Predicting Script Events
Bram Jans | Steven Bethard | Ivan Vulić | Marie Francine Moens
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge
Ivan Vulić | Marie-Francine Moens
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Sub-corpora Sampling with an Application to Bilingual Lexicon Extraction
Ivan Vulić | Marie-Francine Moens
Proceedings of COLING 2012

pdf bib
Coreference Clustering using Column Generation
Jan De Belder | Marie-Francine Moens
Proceedings of COLING 2012: Posters

2011

pdf bib
Model-Portability Experiments for Textual Temporal Analysis
Oleksandr Kolomiyets | Steven Bethard | Marie-Francine Moens
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Identifying Word Translations from Comparable Corpora Using Latent Topic Models
Ivan Vulić | Wim De Smet | Marie-Francine Moens
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
KUL: Recognition and Normalization of Temporal Expressions
Oleksandr Kolomiyets | Marie-Francine Moens
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
Spatial Role Labeling: Task Definition and Annotation Scheme
Parisa Kordjamshidi | Martijn Van Otterlo | Marie-Francine Moens
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

One of the essential functions of natural language is to talk about spatial relationships between objects. Linguistic constructs can express highly complex, relational structures of objects, spatial relations between them, and patterns of motion through spaces relative to some reference point. Learning how to map this information onto a formal representation from a text is a challenging problem. At present no well-defined framework for automatic spatial information extraction exists that can handle all of these issues. In this paper we introduce the task of spatial role labeling and propose an annotation scheme that is language-independent and facilitates the application of machine learning techniques. Our framework consists of a set of spatial roles based on the theory of holistic spatial semantics with the intent of covering all aspects of spatial concepts, including both static and dynamic spatial relations. We illustrate our annotation scheme with many examples throughout the paper, and in addition we highlight how to connect to spatial calculi such as region connection calculus and also how our approach fits into related work.

2009

pdf bib
Semi-supervised Semantic Role Labeling Using the Latent Words Language Model
Koen Deschacht | Marie-Francine Moens
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Meeting TempEval-2: Shallow Approach for Temporal Tagger
Oleksandr Kolomiyets | Marie-Francine Moens
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf bib
Proceedings of the 2009 Workshop on Knowledge and Reasoning for Answering Questions (KRAQ 2009)
Patrick Saint-Dizier | Marie-Francine Moens
Proceedings of the 2009 Workshop on Knowledge and Reasoning for Answering Questions (KRAQ 2009)

2008

pdf bib
Language Resources for Studying Argument
Chris Reed | Raquel Mochales Palau | Glenn Rowe | Marie-Francine Moens
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the development of a written corpus of argumentative reasoning. Arguments in the corpus have been analysed using state-of-the-art techniques from argumentation theory and have been marked up using an open, reusable markup language. A number of the key challenges encountered during the process are explored, and preliminary observations about features such as inter-coder reliability and corpus statistics are discussed. In addition, several examples are offered of how this kind of language resource can be used in linguistic, computational and philosophical research, and in particular, how the corpus has been used to initiate a programme investigating the automatic detection of argumentative structure.

pdf bib
Coling 2008: Proceedings of the workshop on Knowledge and Reasoning for Answering Questions
Marie-Francine Moens | Patrick Saint-Dizier
Coling 2008: Proceedings of the workshop on Knowledge and Reasoning for Answering Questions

2007

pdf bib
Text Analysis for Automatic Image Annotation
Koen Deschacht | Marie-Francine Moens
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

pdf bib
Efficient Hierarchical Entity Classifier Using Conditional Random Fields
Koen Deschacht | Marie-Francine Moens
Proceedings of the 2nd Workshop on Ontology Learning and Population: Bridging the Gap between Text and Knowledge

pdf bib
Measuring Aboutness of an Entity in a Text
Marie-Francine Moens | Patrick Jeuniaux | Roxana Angheluta | Rudradeb Mitra
Proceedings of TextGraphs: the First Workshop on Graph Based Methods for Natural Language Processing

2002

pdf bib
Semantic Case Role Detection for Information Extraction
Rik De Busser | Roxana Angheluta | Marie-Francine Moens
COLING 2002: The 17th International Conference on Computational Linguistics: Project Notes