Martin Jaggi


2020

Masking as an Efficient Alternative to Finetuning for Pretrained Language Models
Mengjie Zhao | Tao Lin | Fei Mi | Martin Jaggi | Hinrich Schütze
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning. Extensive evaluations of masking BERT, RoBERTa, and DistilBERT on eleven diverse NLP tasks show that our masking scheme yields performance comparable to finetuning, yet has a much smaller memory footprint when inference is required for several tasks. Intrinsic evaluations show that representations computed by our binary-masked language models encode the information necessary for solving downstream tasks. Analyzing the loss landscape, we show that masking and finetuning produce models that reside in minima that can be connected by a line segment with nearly constant test accuracy. This confirms that masking can be utilized as an efficient alternative to finetuning.
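
As a rough Python sketch of the core mechanism (not the authors' exact implementation), the snippet below learns a binary mask over a frozen pretrained linear layer with a straight-through estimator; the score initialization and the zero threshold are illustrative assumptions:

import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    # Wraps a pretrained linear layer and trains only a binary mask over it.
    def __init__(self, linear: nn.Linear):
        super().__init__()
        # Pretrained weights stay frozen; only the mask scores are learned.
        self.weight = nn.Parameter(linear.weight.detach(), requires_grad=False)
        self.bias = nn.Parameter(linear.bias.detach(), requires_grad=False)
        # Real-valued scores, initialized so the mask starts as all-ones
        # (an assumption made for illustration).
        self.scores = nn.Parameter(torch.full_like(self.weight, 0.01))

    def forward(self, x):
        hard = (self.scores > 0).float()  # binarize in the forward pass
        # Straight-through estimator: binary forward, identity gradient back.
        mask = hard + self.scores - self.scores.detach()
        return nn.functional.linear(x, self.weight * mask, self.bias)

Replacing the linear sublayers of a pretrained transformer with such wrappers means each additional task stores one bit per masked weight instead of a full finetuned copy of the model.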

2019

Correlating Twitter Language with Community-Level Health Outcomes
Arno Schneuwly | Ralf Grubenmann | Séverine Rion Logean | Mark Cieliebak | Martin Jaggi
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

We study how language on social media is linked to life-threatening diseases such as atherosclerotic heart disease (AHD), diabetes, and various types of cancer. Our proposed model leverages state-of-the-art sentence embeddings, followed by a regression model and clustering, without the need for additional labelled data. It allows us to predict community-level medical outcomes from language, and thereby potentially to translate these predictions to the individual level. The method is applicable to a wide range of target variables and allows us to discover known and potentially novel correlations of medical outcomes with lifestyle aspects and other socioeconomic risk factors.
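
The shape of this pipeline can be illustrated with a short, hedged Python sketch: mean-pool tweet embeddings per community, then fit a regularized regression against the outcome. The embedding source, the Ridge regressor, and the data layout are all assumptions for demonstration, not the paper's exact setup.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def community_features(embeddings_by_community):
    # Mean-pool tweet embeddings into one feature vector per community.
    return np.stack([e.mean(axis=0) for e in embeddings_by_community])

def fit_outcome_model(embeddings_by_community, outcomes):
    # embeddings_by_community: list of (n_tweets_i, dim) arrays, one per community
    # outcomes: (n_communities,) array, e.g. an AHD mortality rate per county
    X = community_features(embeddings_by_community)
    model = Ridge(alpha=1.0)  # illustrative choice of regressor
    r2 = cross_val_score(model, X, outcomes, scoring="r2", cv=5).mean()
    return model.fit(X, outcomes), r2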

Better Word Embeddings by Disentangling Contextual n-Gram Information
Prakhar Gupta | Matteo Pagliardini | Martin Jaggi
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings results in improved unigram embeddings. We claim that training word embeddings along with higher-order n-gram embeddings helps remove contextual information from the unigrams, resulting in better stand-alone word embeddings. We empirically validate this hypothesis by outperforming competing word representation models by a significant margin on a wide variety of tasks. We make our models publicly available.
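
A toy PyTorch sketch of the joint-training idea follows: the CBOW-style context vector averages unigram and bigram vectors together, so that bigram vectors can absorb contextual information that would otherwise leak into the unigrams. Vocabulary handling and the loss (a negative-sampling objective over the returned scores) are simplified assumptions, not the paper's exact training setup.

import torch
import torch.nn as nn

class NGramCBOW(nn.Module):
    def __init__(self, n_unigrams, n_bigrams, dim):
        super().__init__()
        self.uni = nn.Embedding(n_unigrams, dim)  # unigram (word) vectors
        self.bi = nn.Embedding(n_bigrams, dim)    # bigram vectors
        self.out = nn.Embedding(n_unigrams, dim)  # target-word vectors

    def forward(self, uni_ctx, bi_ctx, target):
        # Context = mean over unigram AND bigram vectors jointly.
        ctx = torch.cat([self.uni(uni_ctx), self.bi(bi_ctx)], dim=1).mean(dim=1)
        return (ctx * self.out(target)).sum(dim=-1)  # dot-product score

After training, only the unigram table (self.uni) would be kept as the stand-alone, disentangled word embeddings.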

2018

Simple Unsupervised Keyphrase Extraction using Sentence Embeddings
Kamil Bennani-Smires | Claudiu Musat | Andreea Hossmann | Michael Baeriswyl | Martin Jaggi
Proceedings of the 22nd Conference on Computational Natural Language Learning

Keyphrase extraction is the task of automatically selecting a small set of phrases that best describe a given free-text document. Supervised keyphrase extraction requires large amounts of labeled training data and generalizes very poorly outside the domain of the training data. At the same time, unsupervised systems have poor accuracy and often do not generalize well, as they require the input document to belong to a larger corpus also given as input. Addressing these drawbacks, in this paper we tackle keyphrase extraction from single documents with EmbedRank: a novel unsupervised method that leverages sentence embeddings. EmbedRank achieves higher F-scores than state-of-the-art graph-based systems on standard datasets and is suitable for real-time processing of large amounts of Web data. With EmbedRank, we also explicitly increase coverage and diversity among the selected keyphrases by introducing an embedding-based maximal marginal relevance (MMR) for new phrases. A user study including over 200 votes showed that, although reducing the phrases’ semantic overlap leads to no gains in F-score, our high-diversity selection is preferred by humans.
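
The MMR selection step lends itself to a compact sketch. The Python code below follows the general MMR formulation over phrase and document embeddings; the lambda trade-off, cosine similarity, and max-redundancy choices are assumptions and may differ from EmbedRank's exact normalization.

import numpy as np

def mmr_select(doc_emb, phrase_embs, k=10, lam=0.5):
    # Pick k phrases, balancing relevance to the document against
    # redundancy with the phrases already selected.
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

    relevance = [cos(p, doc_emb) for p in phrase_embs]
    selected, remaining = [], list(range(len(phrase_embs)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            redundancy = max((cos(phrase_embs[i], phrase_embs[j])
                              for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

Setting lam close to 1 recovers pure relevance ranking; lowering it trades relevance for diversity among the selected keyphrases.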

Unsupervised Learning of Sentence Embeddings Using Compositional n-Gram Features
Matteo Pagliardini | Prakhar Gupta | Martin Jaggi
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

The recent tremendous success of unsupervised word embeddings in a multitude of applications raises the obvious question of whether similar methods could be derived to improve embeddings (i.e., semantic representations) of word sequences as well. We present a simple but efficient unsupervised objective to train distributed representations of sentences. Our method outperforms the state-of-the-art unsupervised models on most benchmark tasks, highlighting the robustness of the produced general-purpose sentence embeddings.
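
At inference time the compositional idea reduces to averaging: a sentence vector is the mean of its word and word-n-gram vectors. The sketch below assumes a plain dict as the learned lookup table; the actual model learns these vectors with an unsupervised CBOW-like objective.

import numpy as np

def sentence_embedding(tokens, vec, n=2, dim=300):
    # Collect unigrams plus contiguous n-grams (here bigrams) of the sentence.
    grams = list(tokens)
    grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    # Average the vectors of all grams found in the lookup table.
    found = [vec[g] for g in grams if g in vec]
    return np.mean(found, axis=0) if found else np.zeros(dim)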

2017

Generating Steganographic Text with LSTMs
Tina Fang | Martin Jaggi | Katerina Argyraki
Proceedings of ACL 2017, Student Research Workshop

2016

SwissCheese at SemEval-2016 Task 4: Sentiment Classification Using an Ensemble of Convolutional Neural Networks with Distant Supervision
Jan Deriu | Maurice Gonzenbach | Fatih Uzdilli | Aurelien Lucchi | Valeria De Luca | Martin Jaggi
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Swiss-Chocolate: Combining Flipout Regularization and Random Forests with Artificially Built Subsystems to Boost Text-Classification for Sentiment
Fatih Uzdilli | Martin Jaggi | Dominic Egger | Pascal Julmy | Leon Derczynski | Mark Cieliebak
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Swiss-Chocolate: Sentiment Detection using Sparse SVMs and Part-Of-Speech n-Grams
Martin Jaggi | Fatih Uzdilli | Mark Cieliebak
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)