Efficient Deployment of Conversational Natural Language Interfaces over Databases

Anthony Colas, Trung Bui, Franck Dernoncourt, Moumita Sinha, Doo Soon Kim


Abstract
Many users rely on chatbots and AI assistants for help with various tasks. A key component of such an assistant is the ability to understand and answer a user’s natural language questions, i.e., question answering (QA). Because data is usually stored in a structured manner, an essential step is translating a natural language question into the corresponding query in a query language. However, training most state-of-the-art natural language-to-query-language (NL-to-QL) models first requires a large amount of training data. In most domains, this data is not available, and collecting such datasets for various domains can be tedious and time-consuming. In this work, we propose a novel method for accelerating training dataset collection for developing NL-to-QL machine learning models. Our system allows one to generate conversational multi-turn data, where multiple turns define a dialogue session, enabling one to better utilize chatbot interfaces. We train two current state-of-the-art NL-to-QL models on both SQL-based and SPARQL-based datasets in order to showcase the adaptability and efficacy of our created data.
Anthology ID: 2020.nli-1.4
Volume: Proceedings of the First Workshop on Natural Language Interfaces
Month: July
Year: 2020
Address: Online
Venues: ACL | NLI | WS
Publisher: Association for Computational Linguistics
Pages: 27–36
URL: https://www.aclweb.org/anthology/2020.nli-1.4
DOI: 10.18653/v1/2020.nli-1.4
PDF: http://aclanthology.lst.uni-saarland.de/2020.nli-1.4.pdf
Video: http://slideslive.com/38929798