Using Paraphrasing and Memory-Augmented Models to Combat Data Sparsity in Question Interpretation with a Virtual Patient Dialogue System
Lifeng Jin, David King, Amad Hussein, Michael White, Douglas Danforth
Abstract
When interpreting questions in a virtual patient dialogue system one must inevitably tackle the challenge of a long tail of relatively infrequently asked questions. To make progress on this challenge, we investigate the use of paraphrasing for data augmentation and neural memory-based classification, finding that the two methods work best in combination. In particular, we find that the neural memory-based approach not only outperforms a straight CNN classifier on low frequency questions, but also takes better advantage of the augmented data created by paraphrasing, together yielding a nearly 10% absolute improvement in accuracy on the least frequently asked questions.- Anthology ID:
- W18-0502
- Volume:
- Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Venues:
- BEA | NAACL | WS
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 13–23
- Language:
- URL:
- https://www.aclweb.org/anthology/W18-0502
- DOI:
- 10.18653/v1/W18-0502
- PDF:
- http://aclanthology.lst.uni-saarland.de/W18-0502.pdf