David Doermann


2012

Linguistic Resources for Handwriting Recognition and Translation Evaluation
Zhiyi Song | Safa Ismael | Stephen Grimes | David Doermann | Stephanie Strassel
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe efforts to create corpora to support the development and evaluation of handwriting recognition and translation technology. LDC has developed a stable pipeline and infrastructure for collecting and annotating handwriting linguistic resources to support the MADCAT and OpenHaRT evaluations. We collect and annotate handwritten samples of pre-processed Arabic and Chinese data from the GALE program that has already been translated into English. To date, LDC has recruited more than 600 scribes and has collected, annotated, and released more than 225,000 handwriting images. Most linguistic resources created for these programs will be made available to the larger research community through publication in LDC's catalog. The Phase 1 MADCAT corpus is now available.

A Random Forest System Combination Approach for Error Detection in Digital Dictionaries
Michael Bloodgood | Peng Ye | Paul Rodrigues | David Zajic | David Doermann
Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data

Leveraging Statistical Transliteration for Dictionary-Based English-Bengali CLIR of OCR'd Text
Utpal Garain | Arjun Das | David Doermann | Douglas Oard
Proceedings of COLING 2012: Posters

2011

Cross-Language Entity Linking
Paul McNamee | James Mayfield | Dawn Lawrie | Douglas Oard | David Doermann
Proceedings of 5th International Joint Conference on Natural Language Processing

2006

Morphology Induction from Limited Noisy Data Using Approximate String Matching
Burcu Karagol-Ayan | David Doermann | Amy Weinberg
Proceedings of the Eighth Meeting of the ACL Special Interest Group on Computational Phonology and Morphology at HLT-NAACL 2006

Adaptive Transformation-Based Learning for Improving Dictionary Tagging
Burcu Karagol-Ayan | David Doermann | Amy Weinberg
11th Conference of the European Chapter of the Association for Computational Linguistics

2003

Desperately Seeking Cebuano
Douglas W. Oard | David Doermann | Bonnie Dorr | Daqing He | Philip Resnik | Amy Weinberg | William Byrne | Sanjeev Khudanpur | David Yarowsky | Anton Leuski | Philipp Koehn | Kevin Knight
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers