Analysis of Twitter Data for Postmarketing Surveillance in Pharmacovigilance

Julie Pain, Jessie Levacher, Adam Quinquenel, Anja Belz


Abstract
Postmarketing surveillance (PMS) has the vital aim to monitor effects of drugs after release for use by the general population, but suffers from under-reporting and limited coverage. Automatic methods for detecting drug effect reports, especially for social media, could vastly increase the scope of PMS. Very few automatic PMS methods are currently available, in particular for the messy text types encountered on Twitter. In this paper we describe first results for developing PMS methods specifically for tweets. We describe the corpus of 125,669 tweets we have created and annotated to train and test the tools. We find that generic tools perform well for tweet-level language identification and tweet-level sentiment analysis (both 0.94 F1-Score). For detection of effect mentions we are able to achieve 0.87 F1-Score, while effect-level adverse-vs.-beneficial analysis proves harder with an F1-Score of 0.64. Among other things, our results indicate that MetaMap semantic types provide a very promising basis for identifying drug effect mentions in tweets.
Anthology ID:
W16-3914
Volume:
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venues:
WNUT | WS
SIG:
Publisher:
The COLING 2016 Organizing Committee
Note:
Pages:
94–101
Language:
URL:
https://www.aclweb.org/anthology/W16-3914
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/W16-3914.pdf