Francisco M. Couto

Also published as: Francisco Couto


2020

COVID-19: A Semantic-Based Pipeline for Recommending Biomedical Entities
Marcia Afonso Barros | Andre Lamurias | Diana Sousa | Pedro Ruas | Francisco M. Couto
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

With the increasing number of publications about COVID-19, it is a challenge to extract personalized knowledge suitable for each researcher. This work aims to build a new semantic-based pipeline for recommending biomedical entities to scientific researchers. To this end, we developed a pipeline that creates an implicit feedback matrix based on Named Entity Recognition (NER) on a corpus of documents, using multidisciplinary ontologies for recognizing and linking the entities. Our hypothesis is that by using ontologies from different fields in the NER phase, we can improve the results for state-of-the-art collaborative-filtering recommender systems applied to the dataset created. Tests performed on the COVID-19 Open Research Dataset (CORD-19) show that when using four ontologies, the results for precision@k, for example, reach 80%, whereas when using only one ontology, the results for precision@k drop to 20% for the same users. Furthermore, the use of multi-field entities may help in the discovery of new items, even if the researchers do not have items from that field in their set of preferences.

2019

A Silver Standard Corpus of Human Phenotype-Gene Relations
Diana Sousa | Andre Lamurias | Francisco M. Couto
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Human phenotype-gene relations are fundamental to fully understand the origin of some phenotypic abnormalities and their associated diseases. Biomedical literature is the most comprehensive source of these relations; however, we need Relation Extraction tools to automatically recognize them. Most of these tools require an annotated corpus and, to the best of our knowledge, there is no corpus available annotated with human phenotype-gene relations. This paper presents the Phenotype-Gene Relations (PGR) corpus, a silver standard corpus of human phenotype and gene annotations and their relations. The corpus consists of 1712 abstracts, 5676 human phenotype annotations, 13835 gene annotations, and 4283 relations. We generated this corpus using Named-Entity Recognition tools, whose results were partially evaluated by eight curators, obtaining a precision of 87.01%. By using the corpus we were able to obtain promising results with two state-of-the-art deep learning tools, namely a precision of 78.05%. The PGR corpus was made publicly available to the research community.

2017

MoRS at SemEval-2017 Task 3: Easy to use SVM in Ranking Tasks
Miguel J. Rodrigues | Francisco M. Couto
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our system, dubbed MoRS (Modular Ranking System), pronounced ‘Morse’, which participated in Task 3 of SemEval-2017. We used MoRS to perform the Community Question Answering Task 3, which consisted of reordering a set of comments according to their usefulness in answering the question in the thread. This was done for a large collection of questions created by a user community. For this challenge, we wanted to go back to simple, easy-to-use, and somewhat forgotten technologies that we think non-expert people could reuse on their own data sets. Some of our techniques included the annotation of text, the retrieval of meta-data for each comment, POS tagging and Named Entity Recognition, among others. These gave rise to syntactical analysis and semantic measurements. Finally, we show and discuss our results and the context of our approach, which is part of a more comprehensive system in development, named MoQA.

ULISBOA at SemEval-2017 Task 12: Extraction and classification of temporal expressions and events
Andre Lamurias | Diana Sousa | Sofia Pereira | Luka Clarke | Francisco M. Couto
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper presents our approach to the SemEval 2017 Task 12: Clinical TempEval challenge, specifically to the event and time expression span and attribute identification subtasks (ES, EA, TS, TA). Our approach consisted of training Conditional Random Fields (CRF) classifiers using the provided annotations, and creating manually curated rules to classify the attributes of each event and time expression. We used a set of common features for the event and time CRF classifiers, and a set of features specific to each type of entity, based on domain knowledge. Training only on the source domain data, our best F-scores were 0.683 and 0.485 for the event and time span identification subtasks, respectively. When adding target domain annotations to the training data, the best F-scores obtained were 0.729 and 0.554 for the same subtasks. We obtained the second highest F-score of the challenge on the event polarity subtask (0.708). The source code of our system, Clinical Timeline Annotation (CiTA), is available at https://github.com/lasigeBioTM/CiTA.

2016

Extraction of Regulatory Events using Kernel-based Classifiers and Distant Supervision
Andre Lamurias | Miguel J. Rodrigues | Luka A. Clarke | Francisco M. Couto
Proceedings of the 4th BioNLP Shared Task Workshop

ULISBOA at SemEval-2016 Task 12: Extraction of temporal expressions, clinical events and relations using IBEnt
Marcia Barros | Andre Lamurias | Gonçalo Figueiro | Marta Antunes | Joana Teixeira | Alexandre Pinheiro | Francisco M. Couto
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

ULisboa: Recognition and Normalization of Medical Concepts
André Leal | Bruno Martins | Francisco Couto
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

ULisboa: Identification and Classification of Medical Concepts
André Leal | Diogo Gonçalves | Bruno Martins | Francisco M. Couto
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2013

REACTION: A naive machine learning approach for sentiment classification
Silvio Moreira | João Filgueiras | Bruno Martins | Francisco Couto | Mário J. Silva
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

LASIGE: using Conditional Random Fields and ChEBI ontology
Tiago Grego | Francisco Pinto | Francisco M. Couto
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)