Transformer-based Model for Single Documents Neural Summarization

Elozino Egonmwan, Yllias Chali


Abstract
We propose a system that improves performance on single document summarization task using the CNN/DailyMail and Newsroom datasets. It follows the popular encoder-decoder paradigm, but with an extra focus on the encoder. The intuition is that the probability of correctly decoding an information significantly lies in the pattern and correctness of the encoder. Hence we introduce, encode –encode – decode. A framework that encodes the source text first with a transformer, then a sequence-to-sequence (seq2seq) model. We find that the transformer and seq2seq model complement themselves adequately, making for a richer encoded vector representation. We also find that paying more attention to the vocabulary of target words during abstraction improves performance. We experiment our hypothesis and framework on the task of extractive and abstractive single document summarization and evaluate using the standard CNN/DailyMail dataset and the recently released Newsroom dataset.
Anthology ID:
D19-5607
Volume:
Proceedings of the 3rd Workshop on Neural Generation and Translation
Month:
November
Year:
2019
Address:
Hong Kong
Venues:
EMNLP | NGT | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
70–79
Language:
URL:
https://www.aclweb.org/anthology/D19-5607
DOI:
10.18653/v1/D19-5607
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/D19-5607.pdf