Semi-Supervised Neural Text Generation by Joint Learning of Natural Language Generation and Natural Language Understanding Models
Raheel Qader | François Portet | Cyril Labbé
Proceedings of the 12th International Conference on Natural Language Generation

In Natural Language Generation (NLG), End-to-End (E2E) systems trained through deep learning have recently attracted strong interest. Such deep models need a large amount of carefully annotated data to reach satisfactory performance. However, acquiring such datasets for every new NLG application is a tedious and time-consuming task. In this paper, we propose a semi-supervised deep learning scheme that can learn from non-annotated data, and from annotated data when available. It uses an NLG and a Natural Language Understanding (NLU) sequence-to-sequence model, which are learned jointly to compensate for the lack of annotation. Experiments on two benchmark datasets show that, with a limited amount of annotated data, the method can achieve very competitive results while not using any pre-processing or re-scoring tricks. These findings open the way to exploiting non-annotated datasets, addressing the current bottleneck in developing E2E NLG systems for new applications.


Generation of Company descriptions using concept-to-text and text-to-text deep models: dataset collection and systems evaluation
Raheel Qader | Khoder Jneid | François Portet | Cyril Labbé
Proceedings of the 11th International Conference on Natural Language Generation

In this paper, we study the performance of several state-of-the-art sequence-to-sequence models applied to the generation of short company descriptions. The models are evaluated on a newly created and publicly available company dataset that has been collected from Wikipedia. The dataset consists of around 51K company descriptions that can be used for both concept-to-text and text-to-text generation tasks. Automatic metrics and human evaluation scores computed on the generated company descriptions show promising results despite the difficulty of the task, as the dataset (like most available datasets) was not originally designed for machine learning. In addition, we perform a correlation analysis between automatic metrics and human evaluations and show that certain automatic metrics correlate more strongly with human judgments.


A Personal Storytelling about Your Favorite Data
Cyril Labbé | Claudia Roncancio | Damien Bras
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)