Ana Alves


Corpora and Baselines for Humour Recognition in Portuguese
Hugo Gonçalo Oliveira | André Clemêncio | Ana Alves
Proceedings of the 12th Language Resources and Evaluation Conference

Given the lack of work on the automatic recognition of verbal humour in Portuguese, a topic connected with fluency in a natural language, we describe the creation of three corpora, covering two styles of humour and four sources of non-humorous text, that may be used for related studies. We then report on experiments in which the created corpora were used for training and testing computational models that exploit content and linguistic features for humour recognition. The results helped us draw some conclusions about this challenge and may serve as baselines for those willing to tackle it in the future using the same corpora.

AIA-BDE: A Corpus of FAQs in Portuguese and their Variations
Hugo Gonçalo Oliveira | João Ferreira | José Santos | Pedro Fialho | Ricardo Rodrigues | Luisa Coheur | Ana Alves
Proceedings of the 12th Language Resources and Evaluation Conference

We present AIA-BDE, a corpus of 380 domain-oriented FAQs in Portuguese and their variations, i.e., paraphrases or entailed questions, created manually, by humans, or automatically, with Google Translate. It aims to be used as a benchmark for FAQ retrieval and automatic question-answering, but may be useful in other contexts, such as the development of task-oriented dialogue systems, or models for natural language inference in an interrogative context. We also report on two experiments. In the first, matching variations with their original questions was not trivial with a set of unsupervised baselines, especially for manually created variations. Besides the high performance obtained with ELMo and BERT embeddings, an Information Retrieval system was surprisingly competitive when considering only the first hit. In the second experiment, text classifiers were trained with the original questions and tested on assigning each variation to one of three possible sources or labelling it as out-of-domain. Here, the difference between manual and automatic variations was not so significant.


Can Topic Modelling benefit from Word Sense Information?
Adriana Ferrugento | Hugo Gonçalo Oliveira | Ana Alves | Filipe Rodrigues
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper proposes a new topic model that exploits word sense information in order to discover less redundant and more informative topics. Word sense information is obtained from WordNet and the discovered topics are groups of synsets, instead of mere surface words. A key feature is that all the known senses of a word are considered, with their probabilities. Alternative configurations of the model are described and compared to each other and to LDA, the most popular topic model. However, the obtained results suggest that there is no benefit in enriching LDA with word sense information.


Understanding Urban Land Use through the Visualization of Points of Interest
Evgheni Polisciuc | Ana Alves | Penousal Machado
Proceedings of the Fourth Workshop on Vision and Language

ASAP-II: From the Alignment of Phrases to Textual Similarity
Ana Alves | David Simões | Hugo Gonçalo Oliveira | Adriana Ferrugento
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)


ASAP: Automatic Semantic Alignment for Phrases
Ana Alves | Adriana Ferrugento | Mariana Lourenço | Filipe Rodrigues
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)