Market Comment Generation from Data with Noisy Alignments

Yumi Hamazono, Yui Uehara, Hiroshi Noji, Yusuke Miyao, Hiroya Takamura, Ichiro Kobayashi


Abstract
End-to-end models on data-to-text learn the mapping of data and text from the aligned pairs in the dataset. However, these alignments are not always obtained reliably, especially for the time-series data, for which real time comments are given to some situation and there might be a delay in the comment delivery time compared to the actual event time. To handle this issue of possible noisy alignments in the dataset, we propose a neural network model with multi-timestep data and a copy mechanism, which allows the models to learn the correspondences between data and text from the dataset with noisier alignments. We focus on generating market comments in Japanese that are delivered each time an event occurs in the market. The core idea of our approach is to utilize multi-timestep data, which is not only the latest market price data when the comment is delivered, but also the data obtained at several timesteps earlier. On top of this, we employ a copy mechanism that is suitable for referring to the content of data records in the market price data. We confirm the superiority of our proposal by two evaluation metrics and show the accuracy improvement of the sentence generation using the time series data by our proposed method.
Anthology ID:
2020.inlg-1.21
Volume:
Proceedings of the 13th International Conference on Natural Language Generation
Month:
December
Year:
2020
Address:
Dublin, Ireland
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Note:
Pages:
148–157
Language:
URL:
https://www.aclweb.org/anthology/2020.inlg-1.21
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.inlg-1.21.pdf