Liat Ein Dor

Also published as: Liat Ein-Dor


2020

pdf bib
Active Learning for BERT: An Empirical Study
Liat Ein-Dor | Alon Halfon | Ariel Gera | Eyal Shnarch | Lena Dankin | Leshem Choshen | Marina Danilevsky | Ranit Aharonov | Yoav Katz | Noam Slonim
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Real world scenarios present a challenge for text classification, since labels are usually expensive and the data is often characterized by class imbalance. Active Learning (AL) is a ubiquitous paradigm to cope with data scarcity. Recently, pre-trained NLP models, and BERT in particular, are receiving massive attention due to their outstanding performance in various NLP tasks. However, the use of AL with deep pre-trained models has so far received little consideration. Here, we present a large-scale empirical study on active learning techniques for BERT-based classification, addressing a diverse set of AL strategies and datasets. We focus on practical scenarios of binary text classification, where the annotation budget is very small, and the data is often skewed. Our results demonstrate that AL can boost BERT performance, especially in the most realistic scenario in which the initial set of labeled examples is created using keyword-based queries, resulting in a biased sample of the minority class. We release our research framework, aiming to facilitate future research along the lines explored here.

2019

pdf bib
Financial Event Extraction Using Wikipedia-Based Weak Supervision
Liat Ein-Dor | Ariel Gera | Orith Toledo-Ronen | Alon Halfon | Benjamin Sznajder | Lena Dankin | Yonatan Bilu | Yoav Katz | Noam Slonim
Proceedings of the Second Workshop on Economics and Natural Language Processing

Extraction of financial and economic events from text has previously been done mostly using rule-based methods, with more recent works employing machine learning techniques. This work is in line with this latter approach, leveraging relevant Wikipedia sections to extract weak labels for sentences describing economic events. Whereas previous weakly supervised approaches required a knowledge-base of such events, or corresponding financial figures, our approach requires no such additional data, and can be employed to extract economic events related to companies which are not even mentioned in the training data.

2018

pdf bib
Semantic Relatedness of Wikipedia Concepts – Benchmark Data and a Working Solution
Liat Ein Dor | Alon Halfon | Yoav Kantor | Ran Levy | Yosi Mass | Ruty Rinott | Eyal Shnarch | Noam Slonim
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Learning Thematic Similarity Metric from Article Sections Using Triplet Networks
Liat Ein Dor | Yosi Mass | Alon Halfon | Elad Venezian | Ilya Shnayderman | Ranit Aharonov | Noam Slonim
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

In this paper we suggest to leverage the partition of articles into sections, in order to learn thematic similarity metric between sentences. We assume that a sentence is thematically closer to sentences within its section than to sentences from other sections. Based on this assumption, we use Wikipedia articles to automatically create a large dataset of weakly labeled sentence triplets, composed of a pivot sentence, one sentence from the same section and one from another section. We train a triplet network to embed sentences from the same section closer. To test the performance of the learned embeddings, we create and release a sentence clustering benchmark. We show that the triplet network learns useful thematic metrics, that significantly outperform state-of-the-art semantic similarity methods and multipurpose embeddings on the task of thematic clustering of sentences. We also show that the learned embeddings perform well on the task of sentence semantic similarity prediction.

2015

pdf bib
TR9856: A Multi-word Term Relatedness Benchmark
Ran Levy | Liat Ein-Dor | Shay Hummel | Ruty Rinott | Noam Slonim
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)