Ali Orkan Bayer


Automatic Community Creation for Abstractive Spoken Conversations Summarization
Karan Singla | Evgeny Stepanov | Ali Orkan Bayer | Giuseppe Carenini | Giuseppe Riccardi
Proceedings of the Workshop on New Frontiers in Summarization

Summarization of spoken conversations is a challenging task, since it requires deep understanding of dialogs. Abstractive summarization techniques rely on linking the summary sentences to sets of original conversation sentences, i.e., communities. Unfortunately, such linking information is rarely available or requires trained annotators. We propose and experiment with automatic community creation using cosine similarity on different levels of representation: raw text, WordNet SynSet IDs, and word embeddings. We show that the abstractive summarization systems with automatic communities significantly outperform previously published results on both English and Italian corpora.
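
As an illustration of the idea of community creation by cosine similarity, the following is a minimal sketch on the raw-text level only. It is not the authors' implementation: the whitespace tokenization, the bag-of-words vectors, the function names, and the fixed `threshold` value are all assumptions made for this example, and the abstract does not say how sentences are selected once similarities are computed.

```python
import math
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Lowercase, whitespace-tokenized term counts (an assumed representation)."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_communities(summary_sentences, conversation_sentences, threshold=0.3):
    """Map each summary sentence index to the indices of conversation
    sentences whose similarity to it reaches the (hypothetical) threshold."""
    conv_vecs = [bag_of_words(s) for s in conversation_sentences]
    communities = {}
    for i, summary in enumerate(summary_sentences):
        summary_vec = bag_of_words(summary)
        communities[i] = [
            j for j, conv_vec in enumerate(conv_vecs)
            if cosine(summary_vec, conv_vec) >= threshold
        ]
    return communities


if __name__ == "__main__":
    summary = ["the customer reports a broken printer"]
    conversation = [
        "my printer is broken",
        "thank you goodbye",
        "the customer asked about a refund",
    ]
    print(build_communities(summary, conversation))
```

The same linking step could be run on other representations, such as WordNet SynSet IDs or averaged word embeddings, by swapping out `bag_of_words` for a different vectorizer.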


The UniTN Discourse Parser in CoNLL 2015 Shared Task: Token-level Sequence Labeling with Argument-specific Models
Evgeny Stepanov | Giuseppe Riccardi | Ali Orkan Bayer
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task


The Development of the Multilingual LUNA Corpus for Spoken Language System Porting
Evgeny Stepanov | Giuseppe Riccardi | Ali Orkan Bayer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The development of annotated corpora is a critical process in the development of speech applications for multiple target languages. While the technology to develop a monolingual speech application has reached satisfactory results (in terms of performance and effort), porting an existing application from a source language to a target language is still a very expensive task. In this paper we address the problem of creating multilingual aligned corpora and evaluating them in the context of a spoken language understanding (SLU) porting task. We discuss the challenges of manually creating multilingual corpora and present the algorithms for the creation of multilingual SLU via Statistical Machine Translation (SMT).