Analyzing ELMo and DistilBERT on Socio-political News Classification

Berfu Büyüköz, Ali Hürriyetoğlu, Arzucan Özgür


Abstract
This study evaluates the robustness of two state-of-the-art deep contextual language representations, ELMo and DistilBERT, on supervised learning of binary protest news classification (PC) and sentiment analysis (SA) of product reviews. A "cross-context" setting is created by using test sets that are distinct from the training data. The models are fine-tuned and fed into a Feed-Forward Neural Network (FFNN) and a Bidirectional Long Short-Term Memory network (BiLSTM). Multinomial Naive Bayes (MNB) and Linear Support Vector Machine (LSVM) are used as traditional baselines. The results suggest that DistilBERT can transfer generic semantic knowledge to other domains better than ELMo. DistilBERT is also 30% smaller and 83% faster than ELMo, which suggests it is the better choice when the computational budget for training is limited. When generalization is not the top priority and the test domain is similar to the training domain, traditional machine learning (ML) algorithms can still be considered more economical alternatives to deep language representations.
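As a rough illustration of the cross-context idea for one of the traditional baselines, the following minimal sketch trains a multinomial Naive Bayes classifier on one small set of sentences and scores it on a test set that resembles the training data and on one that does not. Everything here is an assumption made for illustration: the toy sentences, the whitespace tokenizer, and the helper names `train_mnb`, `predict`, and `accuracy` are hypothetical and are not taken from the paper, whose actual features and implementation are not described in this abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train_mnb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes with add-alpha (Laplace) smoothing."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {"log_prior": {}, "log_lik": {}}
    for label, n_docs in class_docs.items():
        model["log_prior"][label] = math.log(n_docs / len(docs))
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, doc):
    """Return the label with the highest log posterior; unseen words are skipped."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        log_lik = model["log_lik"][label]
        scores[label] = log_prior + sum(
            log_lik[t] for t in tokenize(doc) if t in log_lik
        )
    return max(scores, key=scores.get)


def accuracy(model, docs, labels):
    hits = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy data, invented for illustration only (1 = protest news, 0 = other news).
train_docs = [
    "workers protest in the capital",
    "students march against the law",
    "the bank reports strong profit",
    "the team wins the final match",
]
train_labels = [1, 1, 0, 0]

# Test set whose vocabulary resembles the training data.
in_docs = ["workers march in the capital", "the bank reports a strong final"]
in_labels = [1, 0]

# Test set from a different (hypothetical) context with shifted vocabulary.
cross_docs = [
    "farmers rally over crop prices in the capital",
    "the minister opens a new bridge",
]
cross_labels = [1, 0]

model = train_mnb(train_docs, train_labels)
in_acc = accuracy(model, in_docs, in_labels)
cross_acc = accuracy(model, cross_docs, cross_labels)
print("in-context accuracy:", in_acc)
print("cross-context accuracy:", cross_acc)
```

In a setup like this, the difference between the two accuracies is the quantity a cross-context evaluation looks at. On a toy set this small the numbers carry no meaning; the sketch only shows the shape of the comparison.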
Anthology ID: 2020.aespen-1.4
Volume: Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020
Month: May
Year: 2020
Address: Marseille, France
Venues: AESPEN | LREC | WS
Publisher: European Language Resources Association (ELRA)
Pages: 9–18
Language: English
URL: https://www.aclweb.org/anthology/2020.aespen-1.4
PDF: http://aclanthology.lst.uni-saarland.de/2020.aespen-1.4.pdf