A topic-based sentence representation for extractive text summarization

Nikolaos Gialitsis, Nikiforos Pittaras, Panagiotis Stamatopoulos


Abstract
In this study, we examine the effect of probabilistic topic model-based word representations on sentence-based extractive summarization. We formulate summary extraction as a binary classification problem and test a variety of machine learning algorithms across a range of settings. A wide experimental evaluation on the MultiLing 2015 MSS dataset illustrates that topic-based representations can benefit the extractive summarization process in terms of F1, ROUGE-L and ROUGE-W scores, compared to a TF-IDF baseline, with QDA-based analysis providing the best results.
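As a rough illustration of one of the metrics named in the abstract, ROUGE-L can be sketched as an F-measure over the longest common subsequence (LCS) of tokens shared by a candidate summary and a reference summary. The function names `lcs_length` and `rouge_l`, the lowercased whitespace tokenization, and the default `beta=1.0` are assumptions made for this sketch; the abstract does not state which ROUGE configuration the evaluation used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the LCS that ended just before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Carry over the best LCS found so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """LCS-based F-measure between two strings (illustrative sketch only).

    Tokenization is plain lowercased whitespace splitting, which is an
    assumption of this sketch and not the paper's stated preprocessing.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


score = rouge_l("the cat sat on the mat", "the cat lay on the mat")
```

Here the two strings share a five-token subsequence out of six tokens each, so precision and recall are both 5/6. With `beta` above 1, recall is weighted more heavily than precision.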
Anthology ID:
W19-8905
Volume:
Proceedings of the Workshop MultiLing 2019: Summarization Across Languages, Genres and Sources
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Venues:
RANLP | WS
Publisher:
INCOMA Ltd.
Pages:
26–34
URL:
https://www.aclweb.org/anthology/W19-8905
DOI:
10.26615/978-954-452-058-8_005
PDF:
http://aclanthology.lst.uni-saarland.de/W19-8905.pdf