Alon Jacovi


2020

Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness?
Alon Jacovi | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

With the growing popularity of deep-learning-based NLP models comes a need for interpretable systems. But what is interpretability, and what constitutes a high-quality interpretation? In this opinion piece we reflect on the current state of interpretability evaluation research. We call for more clearly differentiating between the different desired criteria an interpretation should satisfy, and focus on the faithfulness criterion. We survey the literature with respect to faithfulness evaluation, and arrange the current approaches around three assumptions, providing an explicit form to how faithfulness is “defined” by the community. We provide concrete guidelines on how evaluation of interpretation methods should and should not be conducted. Finally, we claim that the current binary definition of faithfulness sets a potentially unrealistic bar for being considered faithful. We call for discarding the binary notion of faithfulness in favor of a more graded one, which we believe will be of greater practical utility.

Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Shachar Rosenman | Alon Jacovi | Yoav Goldberg
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The process of collecting and annotating training data may introduce distribution artifacts that limit the ability of models to learn correct generalization behavior. We identify failure modes of SOTA relation extraction (RE) models trained on TACRED, which we attribute to limitations in the data annotation process. We collect and annotate a challenge set we call Challenging RE (CRE), based on naturally occurring corpus examples, to benchmark this behavior. Our experiments with four state-of-the-art RE models show that they have indeed adopted shallow heuristics that do not generalize to the challenge-set data. Further, we find that an alternative question-answering modeling approach performs significantly better than the SOTA models on the challenge set, despite worse overall TACRED performance. Adding some of the challenge data as training examples improves the model's performance. Finally, we provide concrete suggestions on how to improve RE data collection to alleviate this behavior.

2019

Learning and Understanding Different Categories of Sexism Using Convolutional Neural Network’s Filters
Sima Sharifirad | Alon Jacovi
Proceedings of the 2019 Workshop on Widening NLP

Sexism is very common in social media and makes the boundaries of free speech tighter for female users. Automatically flagging and removing sexist content requires identifying and describing its specific categories. In this study, inspired by social-science work, we propose three categories of sexism toward women: “Indirect sexism”, “Sexual sexism” and “Physical sexism”. We build classifiers such as a Convolutional Neural Network (CNN) to automatically detect different types of sexism and address problems of annotation. The inherent non-interpretability of CNNs is a challenge for users who detect sexism, as the reason a given utterance is classified as sexist is difficult to glean from a CNN. However, recent research has developed methods for interpreting CNN filters on text data: filters followed by different activation patterns, along with global max-pooling, can help tease apart the most important ngrams from the rest. In this paper, we interpret a CNN model trained to classify sexism in order to understand different categories of sexism by detecting semantic categories of ngrams and clustering them. These ngrams in each category are then used to improve the performance of the classification task. This is preliminary work using machine learning and natural language processing techniques to learn the concept of sexism, and it distinguishes itself by looking at more precise categories of sexism in social media along with an in-depth investigation of the CNN's filters.
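The clustering step described in this abstract can be sketched in a few lines. The following Python example is illustrative and is not the authors' code: it groups ngrams by the pattern of filter activations they produce, using a synthetic activation matrix and placeholder ngram names; with real data from the trained sexism classifier, each resulting cluster would be a candidate semantic category.

# Illustrative sketch: cluster ngrams by their CNN filter-activation patterns.
# The activation matrix and ngram names here are synthetic placeholders.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_ngrams, n_filters = 12, 8
# Rows: ngrams surfaced by global max-pooling; columns: their filter activations.
activations = rng.random((n_ngrams, n_filters))

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(activations)
for cluster in range(3):
    members = [f"ngram_{i}" for i in np.where(labels == cluster)[0]]
    print(f"candidate category {cluster}: {members}")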

2018

Understanding Convolutional Neural Networks for Text Classification
Alon Jacovi | Oren Sar Shalom | Yoav Goldberg
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

We present an analysis into the inner workings of Convolutional Neural Networks (CNNs) for processing text. CNNs used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs CNNs remain a mystery. We aim to understand the method by which the networks process and classify text. We examine the common hypothesis that filters, accompanied by global max-pooling, serve as ngram detectors. We show that filters may capture several different semantic classes of ngrams by using different activation patterns, and that global max-pooling induces behavior which separates important ngrams from the rest. Finally, we show practical use cases derived from our findings in the form of model interpretability (explaining a trained model by deriving a concrete identity for each filter, bridging the gap between visualization tools in vision tasks and NLP) and prediction interpretability (explaining predictions).
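The filter-as-ngram-detector view examined in this paper can be made concrete with a short sketch. The following Python/PyTorch example is illustrative only, not the authors' code: it applies a 1D convolution over token embeddings, takes the global max over time, and records which trigram produced the maximum for each filter; the toy vocabulary, dimensions, and sentence are hypothetical.

# Illustrative sketch of "filters + global max-pooling as ngram detectors".
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab = ["the", "movie", "was", "wonderful", "boring", "not"]
embed = torch.nn.Embedding(len(vocab), 16)
conv = torch.nn.Conv1d(in_channels=16, out_channels=4, kernel_size=3)  # 4 trigram filters

tokens = torch.tensor([[0, 1, 2, 3, 2, 5, 4]])   # toy sentence as vocab indices
x = embed(tokens).transpose(1, 2)                # (batch, emb_dim, seq_len)
activations = F.relu(conv(x))                    # (batch, n_filters, n_trigrams)
max_vals, max_idx = activations.max(dim=2)       # global max-pooling over time

# For each filter, the trigram at max_idx is the one that "fired" it;
# collecting these over a corpus assigns each filter a concrete ngram identity.
words = [vocab[i] for i in tokens[0].tolist()]
for f in range(4):
    start = max_idx[0, f].item()
    print(f"filter {f}: top trigram = {words[start:start + 3]}, "
          f"score = {max_vals[0, f].item():.3f}")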