David D. Lewis

Also published as: David Lewis


2016

Open Data Vocabularies for Assigning Usage Rights to Data Resources from Translation Projects
David Lewis | Kaniz Fatema | Alfredo Maldonado | Brian Walshe | Arturo Calvo
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

An assessment of the intellectual property requirements for data used in machine-aided translation is provided based on a recent EC-funded legal review. This is compared against the capabilities offered by current linked open data standards from the W3C for publishing and sharing translation memories from translation projects, and proposals for adequately addressing the intellectual property needs of stakeholders in translation projects using open data vocabularies are suggested.

2015

FALCON: Federated Active Linguistic data CuratiON
David Lewis
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

2014

Global Intelligent Content: Active Curation of Language Resources using Linked Data
David Lewis | Rob Brennan | Leroy Finn | Dominic Jones | Alan Meehan | Declan O’Sullivan | Sebastian Hellmann | Felix Sasaki
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

As language resources start to become available in linked data formats, it becomes relevant to consider how linked data interoperability can play a role in active language processing workflows as well as for more static language resource publishing. This paper proposes that linked data may have a valuable role to play in tracking the use and generation of language resources in such workflows in order to assess and improve the performance of the language technologies that use the resources, based on feedback from the human involvement typically required within such processes. We refer to this as Active Curation of the language resources, since it is performed systematically over language processing workflows to continuously improve the quality of the resource in specific applications, rather than via dedicated curation steps. We use modern localisation workflows, i.e. assisted by machine translation and text analytics services, to explain how linked data can support such active curation. By referencing how a suitable linked data vocabulary can be assembled by combining existing linked data vocabularies and meta-data from other multilingual content processing annotations and tool exchange standards we aim to demonstrate the relative ease with which active curation can be deployed more broadly.

2012

On Using Linked Data for Language Resource Sharing in the Long Tail of the Localisation Market
David Lewis | Alexander O’Connor | Andrzej Zydroń | Gerd Sjögren | Rahzeb Choudhury
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Innovations in localisation have focused on the collection and leverage of language resources. However, smaller localisation clients and Language Service Providers are poorly positioned to exploit the benefits of language resource reuse in comparison to larger companies. Their low throughput of localised content means they have little opportunity to amass significant resources, such as Translation memories and Terminology databases, to reuse between jobs or to train statistical machine translation engines tailored to their domain specialisms and language pairs. We propose addressing this disadvantage via the sharing and pooling of language resources. However, the current localisation standards do not support multiparty sharing, are not well integrated with emerging language resource standards and do not address key requirements in determining ownership and license terms for resources. We survey standards and research in the area of Localisation, Language Resources and Language Technologies to leverage existing localisation standards via Linked Data methodologies. This points to the potential of using semantic representation of existing data models for localisation workflow metadata, terminology, parallel text, provenance and access control, which we illustrate with an RDF example.

2009

Web Service Integration for Next Generation Localisation
David Lewis | Stephen Curran | Kevin Feeney | Zohar Etzioni | John Keeney | Andy Way | Reinhard Schäler
Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)

1994

Fax: An Alternative to SGML
Kenneth W. Church | William A. Gale | Jonathan I. Helfman | David D. Lewis
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

1993

Evaluating Message Understanding Systems: An Analysis of the Third Message Understanding Conference (MUC-3)
Nancy Chinchor | Lynette Hirschman | David D. Lewis
Computational Linguistics, Volume 19, Number 3, September 1993

1992

Text Filtering in MUC-3 and MUC-4
David D. Lewis | Richard M. Tong
Fourth Message Understanding Conference (MUC-4): Proceedings of a Conference Held in McLean, Virginia, June 16-18, 1992

Feature Selection and Feature Extraction for Text Categorization
David D. Lewis
Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992

1991

Data Extraction as Text Categorization: An Experiment With the MUC-3 Corpus
David D. Lewis
Third Message Understanding Conference (MUC-3): Proceedings of a Conference Held in San Diego, California, May 21-23, 1991

Evaluating Text Categorization
David D. Lewis
Speech and Natural Language: Proceedings of a Workshop Held at Pacific Grove, California, February 19-22, 1991

1990

Representation Quality in Text Classification: An Introduction and Experiment
David D. Lewis
Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27, 1990