Uncertainty Modeling for Machine Comprehension Systems using Efficient Bayesian Neural Networks

Zhengyuan Liu, Pavitra Krishnaswamy, Ai Ti Aw, Nancy Chen


Abstract
While neural approaches have achieved significant improvements on machine comprehension tasks, the models often operate as black boxes with limited interpretability, which requires special attention in domains such as healthcare and education. Quantifying uncertainty helps pave the way toward more interpretable neural networks. Bayesian neural networks have proven effective for estimating model uncertainty in classification and regression tasks. However, the sampling process they require makes inference time grow linearly with the number of samples, so speed becomes a bottleneck in tasks with high system complexity such as question answering or dialogue generation. In this work, we propose a hybrid neural architecture that quantifies model uncertainty via Bayesian weight approximation while speeding up inference by a relative 80% at test time, and we apply it to a clinical dialogue comprehension task. The proposed approach also enables active learning: an updated model can be trained more effectively on new incoming data by selecting samples that are not well represented in the current training scheme.
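The sampling bottleneck the abstract describes can be illustrated with a generic Monte Carlo dropout sketch, one common way to approximate Bayesian weights (this is an illustrative assumption, not necessarily the paper's exact architecture; the layer sizes, dropout rate, and sample count below are arbitrary). Dropout is kept active at test time, and each of the T stochastic forward passes costs one full inference, which is why total inference time scales linearly with T:

```python
import numpy as np

# Hypothetical toy network: sizes and weights are illustrative only.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 16))   # hidden-layer weights
W2 = rng.normal(size=(16, 1))   # output-layer weights

def forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout left ON at test time."""
    h = np.maximum(x @ W1, 0.0)          # ReLU hidden layer
    mask = rng.random(h.shape) > p_drop  # fresh random dropout mask
    h = h * mask / (1.0 - p_drop)        # inverted-dropout scaling
    return h @ W2

def mc_predict(x, T=100):
    """Sample T passes: the mean is the prediction, the standard
    deviation serves as a per-example uncertainty estimate.
    Cost is T full forward passes, hence linear in T."""
    samples = np.stack([forward(x) for _ in range(T)])
    return samples.mean(axis=0), samples.std(axis=0)

x = rng.normal(size=(1, 8))
mean, std = mc_predict(x)
```

Under this sampling scheme, high `std` flags inputs the model is uncertain about, which is also the signal an active-learning loop can use to pick under-represented samples for annotation.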
Anthology ID:
2020.coling-industry.21
Volume:
Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
Month:
December
Year:
2020
Address:
Online
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
228–235
URL:
https://www.aclweb.org/anthology/2020.coling-industry.21
PDF:
http://aclanthology.lst.uni-saarland.de/2020.coling-industry.21.pdf