João Rodrigues

Also published as: João António Rodrigues


pdf bib
Reproduction and Revival of the Argument Reasoning Comprehension Task
João António Rodrigues | Ruben Branco | João Silva | António Branco
Proceedings of the 12th Language Resources and Evaluation Conference

Reproduction of scientific findings is essential for scientific development across all scientific disciplines and reproducing results of previous works is a basic requirement for validating the hypothesis and conclusions put forward by them. This paper reports on the scientific reproduction of several systems addressing the Argument Reasoning Comprehension Task of SemEval2018. Given a recent publication that pointed out spurious statistical cues in the data set used in the shared task, and that produced a revised version of it, we also evaluated the reproduced systems with this new data set. The exercise reported here shows that, in general, the reproduction of these systems is successful with scores in line with those reported in SemEval2018. However, the performance scores are worst than those, and even below the random baseline, when the reproduced systems are run over the revised data set expunged from data artifacts. This demonstrates that this task is actually a much harder challenge than what could have been perceived from the inflated, close to human-level performance scores obtained with the data set used in SemEval2018. This calls for a revival of this task as there is much room for improvement until systems may come close to the upper bound provided by human performance.

pdf bib
Comparative Probing of Lexical Semantics Theories for Cognitive Plausibility and Technological Usefulness
António Branco | João António Rodrigues | Malgorzata Salawa | Ruben Branco | Chakaveh Saedi
Proceedings of the 28th International Conference on Computational Linguistics

Lexical semantics theories differ in advocating that the meaning of words is represented as an inference graph, a feature mapping or a cooccurrence vector, thus raising the question: is it the case that one of these approaches is superior to the others in representing lexical semantics appropriately? Or in its non antagonistic counterpart: could there be a unified account of lexical semantics where these approaches seamlessly emerge as (partial) renderings of (different) aspects of a core semantic knowledge base? In this paper, we contribute to these research questions with a number of experiments that systematically probe different lexical semantics theories for their levels of cognitive plausibility and of technological usefulness. The empirical findings obtained from these experiments advance our insight on lexical semantics as the feature-based approach emerges as superior to the other ones, and arguably also move us closer to finding answers to the research questions above.


pdf bib
Whom to Learn From? Graph- vs. Text-based Word Embeddings
Małgorzata Salawa | António Branco | Ruben Branco | João António Rodrigues | Chakaveh Saedi
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Vectorial representations of meaning can be supported by empirical data from diverse sources and obtained with diverse embedding approaches. This paper aims at screening this experimental space and reports on an assessment of word embeddings supported (i) by data in raw texts vs. in lexical graphs, (ii) by lexical information encoded in association- vs. inference-based graphs, and obtained (iii) by edge reconstruction- vs. matrix factorisation vs. random walk-based graph embedding methods. The results observed with these experiments indicate that the best solutions with graph-based word embeddings are very competitive, consistently outperforming mainstream text-based ones.


pdf bib
Finely Tuned, 2 Billion Token Based Word Embeddings for Portuguese
João Rodrigues | António Branco
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Semantic Equivalence Detection: Are Interrogatives Harder than Declaratives?
João Rodrigues | Chakaveh Saedi | António Branco | João Silva
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Predicting Brain Activation with WordNet Embeddings
João António Rodrigues | Ruben Branco | João Silva | Chakaveh Saedi | António Branco
Proceedings of the Eight Workshop on Cognitive Aspects of Computational Language Learning and Processing

The task of taking a semantic representation of a noun and predicting the brain activity triggered by it in terms of fMRI spatial patterns was pioneered by Mitchell et al. 2008. That seminal work used word co-occurrence features to represent the meaning of the nouns. Even though the task does not impose any specific type of semantic representation, the vast majority of subsequent approaches resort to feature-based models or to semantic spaces (aka word embeddings). We address this task, with competitive results, by using instead a semantic network to encode lexical semantics, thus providing further evidence for the cognitive plausibility of this approach to model lexical meaning.

pdf bib
WordNet Embeddings
Chakaveh Saedi | António Branco | João António Rodrigues | João Silva
Proceedings of The Third Workshop on Representation Learning for NLP

Semantic networks and semantic spaces have been two prominent approaches to represent lexical semantics. While a unified account of the lexical meaning relies on one being able to convert between these representations, in both directions, the conversion direction from semantic networks into semantic spaces started to attract more attention recently. In this paper we present a methodology for this conversion and assess it with a case study. When it is applied over WordNet, the performance of the resulting embeddings in a mainstream semantic similarity task is very good, substantially superior to the performance of word embeddings based on very large collections of texts like word2vec.


pdf bib
Ways of Asking and Replying in Duplicate Question Detection
João António Rodrigues | Chakaveh Saedi | Vladislav Maraev | João Silva | António Branco
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

This paper presents the results of systematic experimentation on the impact in duplicate question detection of different types of questions across both a number of established approaches and a novel, superior one used to address this language processing task. This study permits to gain a novel insight on the different levels of robustness of the diverse detection methods with respect to different conditions of their application, including the ones that approximate real usage scenarios.


pdf bib
SMT and Hybrid systems of the QTLeap project in the WMT16 IT-task
Rosa Gaudio | Gorka Labaka | Eneko Agirre | Petya Osenova | Kiril Simov | Martin Popel | Dieke Oele | Gertjan van Noord | Luís Gomes | João António Rodrigues | Steven Neale | João Silva | Andreia Querido | Nuno Rendeiro | António Branco
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf bib
Adding syntactic structure to bilingual terminology for improved domain adaptation
Mikel Artetxe | Gorka Labaka | Chakaveh Saedi | João Rodrigues | João Silva | António Branco | Eneko Agirre
Proceedings of the 2nd Deep Machine Translation Workshop


pdf bib
Bootstrapping a hybrid deep MT system
João Silva | João Rodrigues | Luís Gomes | António Branco
Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra)

pdf bib
Machine Translation for Multilingual Troubleshooting in the IT Domain: A Comparison of Different Strategies
Sanja Štajner | João Rodrigues | Luís Gomes | António Branco
Proceedings of the 1st Deep Machine Translation Workshop