Hitachi at SemEval-2020 Task 7: Stacking at Scale with Heterogeneous Language Models for Humor Recognition

Terufumi Morishita, Gaku Morio, Hiroaki Ozaki, Toshinori Miyoshi


Abstract
This paper describes the winning system for SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. Our strategy is Stacking at Scale (SaS) with heterogeneous pre-trained language models (PLMs) such as BERT and GPT-2. SaS first fine-tunes a large number of PLMs with various hyperparameters and then applies a powerful stacking ensemble on top of the fine-tuned PLMs. Our experimental results show that SaS outperforms a naive average ensemble by leveraging weaker PLMs as well as high-performing ones. Interestingly, the results also suggest that SaS captured non-funny semantics. Consequently, our system was ranked first in both subtasks by significant margins over the other systems.
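The core of the approach is a stacking ensemble: a meta-learner is trained on the (out-of-fold) predictions of many fine-tuned PLMs rather than simply averaging them. The following is a minimal sketch of that idea, not the authors' implementation; the base-model count, the placeholder data, and the choice of a Ridge meta-learner are all assumptions for illustration.

```python
# Minimal sketch of a stacking ensemble over predictions of several
# fine-tuned PLMs (hypothetical example; not the authors' code).
# Assumes each base PLM has already been fine-tuned and its out-of-fold
# predictions on the training set are available as columns of X_train.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder data: out-of-fold funniness predictions from 3 base PLMs
# (e.g. BERT, GPT-2, RoBERTa) for 1000 edited headlines, plus gold scores.
X_train = rng.uniform(0.0, 3.0, size=(1000, 3))   # base-model predictions
y_train = rng.uniform(0.0, 3.0, size=1000)        # gold mean funniness

# Stacking meta-learner: a regularized linear regressor that learns how
# much to trust each base PLM (Ridge is an assumption here; the paper's
# meta-learner may differ).
meta_learner = Ridge(alpha=1.0)

# Estimate RMSE of the stacked model by cross-validation on the
# out-of-fold predictions (RMSE is the official Subtask 1 metric).
rmse = -cross_val_score(meta_learner, X_train, y_train,
                        scoring="neg_root_mean_squared_error", cv=5).mean()
print(f"stacked RMSE: {rmse:.3f}")

# Fit on all training data; at test time, feed the base PLMs' test-set
# predictions through the fitted meta-learner.
meta_learner.fit(X_train, y_train)
```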
Anthology ID: 2020.semeval-1.101
Volume: Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month: December
Year: 2020
Address: Barcelona (online)
Venues: SemEval | COLING
SIG: SIGLEX
Publisher: International Committee for Computational Linguistics
Pages: 791–803
URL: https://www.aclweb.org/anthology/2020.semeval-1.101
PDF: http://aclanthology.lst.uni-saarland.de/2020.semeval-1.101.pdf