Cristina Vertan


2019

Controlled Semi-automatic Annotation of Classical Ethiopic
Cristina Vertan
Proceedings of the Workshop on Language Technology for Digital Historical Archives

The preservation of cultural heritage by means of digital methods has become extremely popular in recent years. After intensive digitization campaigns, the focus is slowly moving from preservation proper (i.e. digital archiving together with standard search mechanisms) to research-oriented usage of materials available electronically. This usage is intended to go far beyond simple reading of digitized materials; researchers should be able to gain new insights into the materials and discover new facts by means of tools relying on innovative algorithms. In this article we describe the workflow necessary for the annotation of a diachronic corpus of Classical Ethiopic, a language of essential importance for the study of Early Christianity.

Modelling linguistic vagueness and uncertainty in historical texts
Cristina Vertan
Proceedings of the Workshop on Language Technology for Digital Historical Archives

Many applications in Digital Humanities (DH) rely on annotations of the raw material. These annotations (inferred automatically or done manually) assume that labelled facts are either true or false, so all inferences based on such annotations use Boolean logic. This contradicts the hermeneutic principles used in the humanities, in which most knowledge has a degree of truth that varies depending on the experience and the world knowledge of the interpreter. In this paper we show how uncertainty and vagueness, two main features of any historical text, can be encoded in annotations and thus be taken into account by DH applications.

2017

Proceedings of the First Workshop on Language technology for Digital Humanities in Central and (South-)Eastern Europe
Anca Dinu | Petya Osenova | Cristina Vertan
Proceedings of the First Workshop on Language technology for Digital Humanities in Central and (South-)Eastern Europe

On the annotation of vague expressions: a case study on Romanian historical texts
Anca Dinu | Walther von Hahn | Cristina Vertan
Proceedings of the First Workshop on Language technology for Digital Humanities in Central and (South-)Eastern Europe

Current approaches in Digital Humanities tend to ignore a central aspect of any hermeneutic introspection: the intrinsic vagueness of analyzed texts. Especially when dealing with historical documents, neglecting vagueness has important implications on the interpretation of the results. In this paper we present current limitations of annotation approaches and describe a current methodology for annotating vagueness for historical Romanian texts.

2015

Proceedings of the Joint Workshop on Language Technology for Closely Related Languages, Varieties and Dialects
Preslav Nakov | Marcos Zampieri | Petya Osenova | Liling Tan | Cristina Vertan | Nikola Ljubešić | Jörg Tiedemann
Proceedings of the Joint Workshop on Language Technology for Closely Related Languages, Varieties and Dialects

2014

Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)
Kalliopi Zervanou | Cristina Vertan | Antal van den Bosch | Caroline Sporleder
Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH)

Proceedings of the EMNLP’2014 Workshop on Language Technology for Closely Related Languages and Language Variants
Preslav Nakov | Petya Osenova | Cristina Vertan
Proceedings of the EMNLP’2014 Workshop on Language Technology for Closely Related Languages and Language Variants

Proceedings of the Workshop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS-MA 2014)
Constantin Orasan | Petya Osenova | Cristina Vertan
Proceedings of the Workshop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS-MA 2014)

Making historical texts accessible to everybody
Cristina Vertan | Walther von Hahn
Proceedings of the Workshop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS-MA 2014)

2013

Proceedings of the Workshop on Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants
Cristina Vertan | Milena Slavcheva | Petya Osenova
Proceedings of the Workshop on Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants

Language diversity and implications for Language technology in the Multilingual Europe
Cristina Vertan | Walther von Hahn
Proceedings of the Workshop on Adaptation of Language Resources and Tools for Closely Related Languages and Language Variants

A New Syntactic Metric for Evaluation of Machine Translation
Melania Duma | Cristina Vertan | Wolfgang Menzel
51st Annual Meeting of the Association for Computational Linguistics Proceedings of the Student Research Workshop

2012

Same domain different discourse style - A case study on Language Resources for data-driven Machine Translation
Monica Gavrila | Walther v. Hahn | Cristina Vertan
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Data-driven machine translation (MT) approaches have become very popular in recent years, especially for language pairs for which it is difficult to find specialists to develop transfer rules. Statistical (SMT) or example-based (EBMT) systems can provide reasonable translation quality for assimilation purposes, as long as a large amount of training data is available. SMT systems in particular rely on parallel aligned corpora, which have to be statistically relevant for the given language pair. The construction of large domain-specific parallel corpora is time-consuming and costly; current practice relies on one or two such big corpora per language pair. Recently developed strategies ensure a certain portability to other domains through specialized lexicons or small domain-specific corpora. In this paper we discuss the influence of different discourse styles on statistical machine translation systems. We investigate how a pure SMT system performs when training and test data belong to the same domain but the discourse style varies.

Two approaches for integrating translation and retrieval in real applications
Cristina Vertan
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)

Harnessing NLP Techniques in the Processes of Multilingual Content Management
Anelia Belogay | Diman Karagyozov | Svetla Koeva | Cristina Vertan | Adam Przepiórkowski | Dan Cristea | Plovios Raxis
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

Training Data in Statistical Machine Translation - the More, the Better?
Monica Gavrila | Cristina Vertan
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

Proceedings of the Workshop on Language Technologies for Digital Humanities and Cultural Heritage
Cristina Vertan | Milena Slavcheva | Petya Osenova | Stelios Piperidis
Proceedings of the Workshop on Language Technologies for Digital Humanities and Cultural Heritage

Using Manual and Parallel Aligned Corpora for Machine Translation Services within an On-line Content Management System
Cristina Vertan | Monica Gavrila
Proceedings of The Second Workshop on Annotation and Exploitation of Parallel Corpora

2010

Towards the Integration of Language Tools Within Historical Digital Libraries
Cristina Vertan
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In recent years, mass digitization campaigns have made catalogues, valuable rare manuscripts, and old printed books available via the Internet. The Manuscriptorium digital library has ingested hundreds of volumes, and the collection is expected to grow in the coming years. Other European initiatives such as Europeana and Monasterium also have the online presentation of cultural heritage as a central activity. As the number of volumes available online has grown, special attention has been paid to the management and retrieval of documents within digital libraries. Enabling semantic technologies and intelligent linking and search is a big step forward, but these still do not succeed in making the content of old rare books intelligible to the broad public or to specialists in other domains or languages. In this paper we argue that multilingual language technologies have the potential to fill this gap. We give an overview of the existing language resources for historical documents and present an architecture that aims at presenting such texts to the ordinary user without altering the character of the texts.

2009

Proceedings of the Workshop Multilingual resources, technologies and evaluation for central and Eastern European languages
Elena Paskaleva | Stelios Piperidis | Milena Slavcheva | Cristina Vertan
Proceedings of the Workshop Multilingual resources, technologies and evaluation for central and Eastern European languages

ProLiV - a Tool for Teaching by Viewing Computational Linguistics
Monica Gavrila | Cristina Vertan
Proceedings of the ACL-IJCNLP 2009 Software Demonstrations

2005

MANAGELEX and the Semantic Web
Monica Gavrila | Cristina Vertan
Proceedings of OntoLex 2005 - Ontologies and Lexical Resources

2004

Language Resources for the Semantic Web – perspectives for Machine Translation –
Cristina Vertan
Proceedings of the Second International Workshop on Language Resources for Translation Work, Research and Training

2003

Menu choice translation: a flexible menu-based controlled natural language system
Cristina Vertan | Walther von Hahn
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

2002

Architectures of “toy” systems for teaching machine translation
Walther v. Hahn | Cristina Vertan
Proceedings of the 6th EAMT Workshop: Teaching Machine Translation