Martina Miliani


2020

pdf bib
“Voices of the Great War”: A Richly Annotated Corpus of Italian Texts on the First World War
Federico Boschetti | Irene De Felice | Stefano Dei Rossi | Felice Dell’Orletta | Michele Di Giorgio | Martina Miliani | Lucia C. Passaro | Angelica Puddu | Giulia Venturi | Nicola Labanca | Alessandro Lenci | Simonetta Montemagni
Proceedings of the 12th Language Resources and Evaluation Conference

“Voices of the Great War” is the first large corpus of Italian historical texts dating back to the period of First World War. This corpus differs from other existing resources in several respects. First, from the linguistic point of view it gives account of the wide range of varieties in which Italian was articulated in that period, namely from a diastratic (educated vs. uneducated writers), diaphasic (low/informal vs. high/formal registers) and diatopic (regional varieties, dialects) points of view. From the historical perspective, through a collection of texts belonging to different genres it represents different views on the war and the various styles of narrating war events and experiences. The final corpus is balanced along various dimensions, corresponding to the textual genre, the language variety used, the author type and the typology of conveyed contents. The corpus is fully annotated with lemmas, part-of-speech, terminology, and named entities. Significant corpus samples representative of the different “voices” have also been enriched with meta-linguistic and syntactic information. The layer of syntactic annotation forms the first nucleus of an Italian historical treebank complying with the Universal Dependencies standard. The paper illustrates the final resource, the methodology and tools used to build it, and the Web Interface for navigating it.

pdf bib
FRAQUE: a FRAme-based QUEstion-answering system for the Public Administration domain
Martina Miliani | Lucia C. Passaro | Alessandro Lenci
Proceedings of the 1st Workshop on Language Technologies for Government and Public Administration (LT4Gov)

In this paper, we propose FRAQUE, a question answering system for factoid questions in the Public administration domain. The system is based on semantic frames, here intended as collections of slots typed with their possible values. FRAQUE queries unstructured textual data and exploits the potential of different approaches: it extracts pattern elements from texts which are linguistically analyzed through statistical methods.FRAQUE allows Italian users to query vast document repositories related to the domain of Public Administration. Given the statistical nature of most of its components such as word embeddings, the system allows for a flexible domain and language adaptation process. FRAQUE’s goal is to associate questions with frames stored into a Knowledge Graph along with relevant document passages, which are returned as the answer.