Developing a How-to Tip Machine Comprehension Dataset and its Evaluation in Machine Comprehension by BERT

Tengyang Chen, Hongyu Li, Miho Kasamatsu, Takehito Utsuro, Yasuhide Kawada


Abstract
In the field of factoid question answering (QA), state-of-the-art systems are known to have achieved accuracy comparable to that of humans on certain benchmark challenges. In the area of non-factoid QA, by contrast, there are still only a limited number of datasets for training QA models, i.e., machine comprehension models. Given this situation in non-factoid QA, this paper develops a dataset for training Japanese how-to tip QA models. It then applies one of the state-of-the-art machine comprehension models to the Japanese how-to tip QA dataset. The trained how-to tip QA model is also compared with a factoid QA model trained on a Japanese factoid QA dataset. Evaluation results revealed that how-to tip machine comprehension performance was almost comparable to that of factoid machine comprehension, even though the how-to tip training set was only around 4% the size of the factoid training set. Thus, the how-to tip machine comprehension task requires far less training data than the factoid machine comprehension task.
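The paper describes an extractive machine comprehension setup, in which each training example pairs a question with a context passage and a gold answer span. The actual schema of the Japanese how-to tip dataset is not given on this page; as an illustration only, the sketch below builds one hypothetical entry in the SQuAD-style JSON convention commonly used for such models (field names and the example text are assumptions, not taken from the paper):

```python
import json

# Hypothetical how-to tip QA example in SQuAD-style format.
# Field names follow the SQuAD v1.1 convention; the schema of the
# paper's actual Japanese dataset is an assumption here.
context = (
    "To remove a stripped screw, press a wide rubber band between "
    "the screwdriver and the screw head, then turn slowly."
)
question = "How can I remove a stripped screw?"
answer = "press a wide rubber band between the screwdriver and the screw head"

entry = {
    "question": question,
    "context": context,
    "answers": [{
        "text": answer,
        # character offset of the answer span within the context,
        # as required by extractive (span-prediction) QA models
        "answer_start": context.find(answer),
    }],
}

print(json.dumps(entry, ensure_ascii=False, indent=2))
```

A model such as BERT fine-tuned for span prediction would then be trained to output the start and end positions of the answer span within the context.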
Anthology ID:
2020.fever-1.4
Volume:
Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER)
Month:
July
Year:
2020
Address:
Online
Venues:
ACL | FEVER | WS
Publisher:
Association for Computational Linguistics
Note:
Pages:
26–35
URL:
https://www.aclweb.org/anthology/2020.fever-1.4
DOI:
10.18653/v1/2020.fever-1.4
PDF:
http://aclanthology.lst.uni-saarland.de/2020.fever-1.4.pdf
Video:
http://slideslive.com/38929661