Hashtag Sense Clustering Based on Temporal Similarity
Giovanni Stilo | Paola Velardi
Computational Linguistics, Volume 43, Issue 1 - April 2017

Hashtags are creative labels used in micro-blogs to characterize the topic of a message/discussion. Regardless of the use for which they were originally intended, hashtags cannot be used as a means to cluster messages with similar content. First, because hashtags are created in a spontaneous and highly dynamic way by users in multiple languages, the same topic can be associated with different hashtags, and conversely, the same hashtag may refer to different topics in different time periods. Second, contrary to common words, hashtag disambiguation is complicated by the fact that no sense catalogs (e.g., Wikipedia or WordNet) are available; and, furthermore, hashtag labels are difficult to analyze, as they often consist of acronyms, concatenated words, and so forth. A common way to determine the meaning of hashtags has been to analyze their context, but, as we have just pointed out, hashtags can have multiple and variable meanings. In this article, we propose a temporal sense clustering algorithm based on the idea that semantically related hashtags have similar and synchronous usage patterns.
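
The idea that related hashtags have similar and synchronous usage patterns can be illustrated with a small sketch. This is not the algorithm from the article: it swaps in a plain Pearson correlation over equal-length count series and a simple threshold-based grouping, purely as an assumption for illustration. The function names, the threshold value, and the toy counts (`#tagA`, `#tagB`, `#tagC`) are all hypothetical.

```python
from math import sqrt


def z_normalize(series):
    """Rescale a count series to zero mean and unit variance."""
    n = len(series)
    mean = sum(series) / n
    std = sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n  # a flat series carries no temporal shape
    return [(x - mean) / std for x in series]


def temporal_similarity(a, b):
    """Pearson correlation of two series sampled on the same time slots."""
    if len(a) != len(b):
        raise ValueError("series must cover the same time slots")
    za, zb = z_normalize(a), z_normalize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)


def cluster_by_usage(counts, threshold=0.9):
    """Group hashtags whose usage curves correlate at or above the threshold.

    Assumption: clusters are the connected components of the
    'similar enough' graph, tracked with a small union-find.
    """
    tags = sorted(counts)
    parent = {t: t for t in tags}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, s in enumerate(tags):
        for t in tags[i + 1:]:
            if temporal_similarity(counts[s], counts[t]) >= threshold:
                parent[find(t)] = find(s)

    clusters = {}
    for t in tags:
        clusters.setdefault(find(t), []).append(t)
    return sorted(clusters.values())


# Made-up counts per time slot, for illustration only.
toy_counts = {
    "#tagA": [1, 2, 30, 40, 3, 1],
    "#tagB": [0, 1, 25, 38, 2, 0],
    "#tagC": [20, 35, 2, 1, 0, 1],
}
clusters = cluster_by_usage(toy_counts)
```

With these toy series, `#tagA` and `#tagB` peak in the same slots and end up in one group, while `#tagC` peaks earlier and stays apart. Note that the comparison is slot by slot, so two hashtags with the same shape at different times are not grouped, which is the "synchronous" part of the idea.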

What to Write? A topic recommender for journalists
Alessandro Cucchiarelli | Christian Morbidoni | Giovanni Stilo | Paola Velardi
Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism

In this paper we present a recommender system, What To Write and Why, capable of suggesting to a journalist, for a given event, the aspects still uncovered in news articles on which the readers focus their interest. The basic idea is to characterize an event according to the echo it receives in online news sources and associate it with the corresponding readers’ communicative and informative patterns, detected through the analysis of Twitter and Wikipedia, respectively. Our methodology temporally aligns the results of this analysis and recommends the concepts that emerge as topics of interest from Twitter and Wikipedia, either not covered or poorly covered in the published news articles.
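
The recommendation step described above can be sketched roughly as a filter over concept weights. This is a minimal illustration, not the system from the paper: it assumes the three sources have already been temporally aligned to the same event window and reduced to concept-to-weight maps in [0, 1]. The function name, the `coverage_cutoff` and `top_k` parameters, and the placeholder concepts are all hypothetical.

```python
def recommend_uncovered(news, twitter, wikipedia, coverage_cutoff=0.2, top_k=3):
    """Return concepts readers care about that the news barely covers.

    Each argument maps a concept to a weight in [0, 1] for one event
    window (assumed to be aligned in time beforehand).
    """
    # Reader interest: strongest signal from either reader-side source.
    interest = {}
    for source in (twitter, wikipedia):
        for concept, weight in source.items():
            interest[concept] = max(interest.get(concept, 0.0), weight)

    # Keep concepts that are absent from, or weak in, the news coverage.
    candidates = [
        (weight, concept)
        for concept, weight in interest.items()
        if news.get(concept, 0.0) < coverage_cutoff
    ]
    # Highest interest first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [concept for _, concept in candidates[:top_k]]


# Placeholder weights, for illustration only.
news = {"concept_a": 0.9, "concept_b": 0.1}
twitter = {"concept_a": 0.8, "concept_b": 0.6, "concept_c": 0.4}
wikipedia = {"concept_c": 0.7, "concept_d": 0.3}
suggestions = recommend_uncovered(news, twitter, wikipedia)
```

Here `concept_a` is dropped because the news already covers it well, while the remaining concepts are ranked by reader interest.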


Automated learning of everyday patients’ language for medical blogs analytics
Giovanni Stilo | Moreno De Vincenzi | Alberto E. Tozzi | Paola Velardi
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013