Ari Klein


2020

pdf bib
Overview of the Fifth Social Media Mining for Health Applications (#SMM4H) Shared Tasks at COLING 2020
Ari Klein | Ilseyar Alimova | Ivan Flores | Arjun Magge | Zulfat Miftahutdinov | Anne-Lyse Minard | Karen O’Connor | Abeed Sarker | Elena Tutubalina | Davy Weissenbacher | Graciela Gonzalez-Hernandez
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task

The vast amount of data on social media presents significant opportunities and challenges for utilizing it as a resource for health informatics. The fifth iteration of the Social Media Mining for Health Applications (#SMM4H) shared tasks sought to advance the use of Twitter data (tweets) for pharmacovigilance, toxicovigilance, and epidemiology of birth defects. In addition to re-runs of three tasks, #SMM4H 2020 included new tasks for detecting adverse effects of medications in French and Russian tweets, characterizing chatter related to prescription medication abuse, and detecting self reports of birth defect pregnancy outcomes. The five tasks required methods for binary classification, multi-class classification, and named entity recognition (NER). With 29 teams and a total of 130 system submissions, participation in the #SMM4H shared tasks continues to grow.

2018

pdf bib
Dealing with Medication Non-Adherence Expressions in Twitter
Takeshi Onishi | Davy Weissenbacher | Ari Klein | Karen O’Connor | Graciela Gonzalez-Hernandez
Proceedings of the 2018 EMNLP Workshop SMM4H: The 3rd Social Media Mining for Health Applications Workshop & Shared Task

Through a semi-automatic analysis of tweets, we show that Twitter users not only express Medication Non-Adherence (MNA) in social media but also their reasons for not complying; further research is necessary to fully extract automatically and analyze this information, in order to facilitate the use of this data in epidemiological studies.

2017

pdf bib
Detecting Personal Medication Intake in Twitter: An Annotated Corpus and Baseline Classification System
Ari Klein | Abeed Sarker | Masoud Rouhizadeh | Karen O’Connor | Graciela Gonzalez
BioNLP 2017

Social media sites (e.g., Twitter) have been used for surveillance of drug safety at the population level, but studies that focus on the effects of medications on specific sets of individuals have had to rely on other sources of data. Mining social media data for this in-formation would require the ability to distinguish indications of personal medication in-take in this media. Towards that end, this paper presents an annotated corpus that can be used to train machine learning systems to determine whether a tweet that mentions a medication indicates that the individual posting has taken that medication at a specific time. To demonstrate the utility of the corpus as a training set, we present baseline results of supervised classification.