Keith J. Miller

Also published as: Keith Miller


International Multicultural Name Matching Competition: Design, Execution, Results, and Lessons Learned
Keith J. Miller | Elizabeth Schroeder Richerson | Sarah McLeod | James Finley | Aaron Schein
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes different aspects of an open competition to evaluate multicultural name matching software, including the contest design, development of the test data, different phases of the competition, behavior of the participating teams, results of the competition, and lessons learned throughout. The competition, known as The MITRE Challenge™, was informally announced at LREC 2010 and was recently concluded. Contest participants used the competition website to download the competition data set and guidelines, upload results, and view accuracy metrics for each result set submitted. Participants were allowed to submit unlimited result sets, with their top-scoring set determining their overall ranking. The competition website featured a leader board that displayed the top score for each participant, ranked according to the principal contest metric, mean average precision (MAP). MAP and other metrics were calculated in near-real time on a remote server, based on ground truth developed for the competition data set. Additional measures were taken to guard against gaming the competition metric or overfitting to the competition data set. Lessons learned while running this first MITRE Challenge will be valuable to others considering running similar evaluation campaigns.
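As a rough illustration of the principal contest metric named in the abstract above, the following sketch computes mean average precision over a set of name queries. It is not the competition's scoring code: the function names, the data layout (query name mapped to a ranked list of returned names, and to a set of true matches), and the choice to score unanswered queries as zero are all assumptions made for this example. Ranked lists are assumed to contain no duplicates.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list against its true matches."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a true match occurs.
            precision_sum += hits / rank
    # Dividing by the number of true matches penalizes matches never returned.
    return precision_sum / len(relevant)


def mean_average_precision(
    results: Dict[str, List[str]], truth: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over every query in the ground truth."""
    if not truth:
        return 0.0
    total = sum(
        # Assumption: a query with no submitted results scores zero.
        average_precision(results.get(query, []), relevant)
        for query, relevant in truth.items()
    )
    return total / len(truth)
```

Because each query contributes equally to the mean, a system cannot raise its score simply by doing well on queries that have many true matches.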


Improving Personal Name Search in the TIGR System
Keith J. Miller | Sarah McLeod | Elizabeth Schroeder | Mark Arehart | Kenneth Samuel | James Finley | Vanesa Jurica | John Polk
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes the development and evaluation of enhancements to the specialized information retrieval capabilities of a multimodal reporting system. The system enables collection and dissemination of information through a distributed data architecture by allowing users to input free text documents, which are indexed for subsequent search and retrieval by other users. This unstructured data entry method is essential for users of this system, but it requires an intelligent support system for processing queries against the data. The system, known as TIGR (Tactical Ground Reporting), allows keyword searching and geospatial filtering of results, but lacked the ability to efficiently index and search person names and perform approximate name matching. To improve TIGR’s ability to provide accurate, comprehensive results for queries on person names we iteratively updated existing entity extraction and name matching technologies to better align with the TIGR use case. We evaluated each version of the entity extraction and name matching components to find the optimal configuration for the TIGR context, and combined those pieces into a named entity extraction, indexing, and search module that integrates with the current TIGR system. By comparing system-level evaluations of the original and updated TIGR search processes, we show that our enhancements to personal name search significantly improved the performance of the overall information retrieval capabilities of the TIGR system.


Adjudicator Agreement and System Rankings for Person Name Search
Mark Arehart | Chris Wolf | Keith J. Miller
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We have analyzed system rankings for person name search algorithms using a data set for which several versions of ground truth were developed by employing different means of resolving adjudicator conflicts. Thirteen algorithms were ranked by F-score, using bootstrap resampling for significance testing, on a dataset containing 70,000 romanized names from various cultures. We found some disagreement among the four adjudicators, with kappa ranging from 0.57 to 0.78. Truth sets based on a single adjudicator, and on the intersection or union of positive adjudications, produced sizeable variability in scoring sensitivity - and to a lesser degree rank order - compared to the consensus truth set. However, results on truth sets constructed by randomly choosing an adjudicator for each item were highly consistent with the consensus. The implication is that an evaluation where one adjudicator has judged each item is nearly as good as a more expensive and labor-intensive one where multiple adjudicators have judged each item and conflicts are resolved through voting.
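The agreement figures in the abstract above are kappa values. The abstract does not say which kappa variant was used, so the sketch below assumes pairwise Cohen's kappa between two adjudicators who judged the same items; the function name and the label encoding (1 for match, 0 for non-match) are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two adjudicators' labels over the same items."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both adjudicators must label the same non-empty item list")
    n = len(a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each adjudicator's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both adjudicators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, which gives some context for the reported range of 0.57 to 0.78.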

A Ground Truth Dataset for Matching Culturally Diverse Romanized Person Names
Mark Arehart | Keith J. Miller
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the development of a ground truth dataset of culturally diverse Romanized names in which approximately 70,000 names are matched against a subset of 700. We ran the subset as queries against the complete list using several matchers, created adjudication pools, adjudicated the results, and compiled two versions of ground truth based on different sets of adjudication guidelines and methods for resolving adjudicator conflicts. The name list, drawn from publicly available sources, was manually seeded with over 1500 name variants. These names include transliteration variation, database fielding errors, segmentation differences, incomplete names, titles, initials, abbreviations, nicknames, typos, OCR errors, and truncated data. These diverse types of matches, along with the coincidental name similarities already in the list, make possible a comprehensive evaluation of name matching systems. We have used the dataset to evaluate several open source and commercial algorithms and provide some of those results.

An Infrastructure, Tools and Methodology for Evaluation of Multicultural Name Matching Systems
Keith J. Miller | Mark Arehart | Catherine Ball | John Polk | Alan Rubenstein | Kenneth Samuel | Elizabeth Schroeder | Eva Vecchi | Chris Wolf
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes a Name Matching Evaluation Laboratory that is a joint effort across multiple projects. The lab houses our evaluation infrastructure as well as multiple name matching engines and customized analytical tools. Included is an explanation of the methodology used by the lab to carry out evaluations. This methodology is based on standard information retrieval evaluation, which requires a carefully-constructed test data set. The paper describes how we created that test data set, including the “ground truth” used to score the systems’ performance. Descriptions and snapshots of the lab’s various tools are provided, as well as information on how the different tools are used throughout the evaluation process. By using this evaluation process, the lab has been able to identify strengths and weaknesses of different name matching engines. These findings have led the lab to an ongoing investigation into various techniques for combining results from multiple name matching engines to achieve optimal results, as well as into research on the more general problem of identity management and resolution.


Parallel Syntactic Annotation of Multiple Languages
Owen Rambow | Bonnie Dorr | David Farwell | Rebecca Green | Nizar Habash | Stephen Helmreich | Eduard Hovy | Lori Levin | Keith J. Miller | Teruko Mitamura | Florence Reeder | Advaith Siddharthan
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes an effort to investigate the incrementally deepening development of an interlingua notation, validated by human annotation of texts in English plus six languages. We begin with deep syntactic annotation, and in this paper present a series of annotation manuals for six different languages at the deep-syntactic level of representation. Many syntactic differences between languages are removed in the proposed syntactic annotation, making them useful resources for multilingual NLP projects with semantic components.

Formal v. Informal: Register-Differentiated Arabic MT Evaluation in the PLATO Paradigm
Keith J. Miller | Michelle Vanni
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Tasks performed on machine translation (MT) output are associated with input text types such as genre and topic. Predictive Linguistic Assessments of Translation Output, or PLATO, MT Evaluation (MTE) explores a predictive relationship between linguistic metrics and the information processing tasks reliably performable on output. PLATO assigns a linguistic signature, which cuts across the task-based and automated metric paradigms. Here we report on PLATO assessments of clarity, coherence, morphology, syntax, lexical robustness, name-rendering, and terminology in a comparison of Arabic MT engines in which register differentiates the input. With a team of 10 assessors employing eight linguistic tests, we analyzed the results of five systems’ processing of 10 input texts from two distinct linguistic registers: in total, we analyzed 800 data sets. The analysis pointed to specific areas, such as general lexical robustness, where system performance was comparable on both types of input. Divergent performance, however, was observed on clarity and name-rendering assessments. These results suggest that, while systems may be considered reliable regardless of input register for the lexicon-dependent triage task, register may have an effect on the suitability of MT systems’ output for relevance judgment and information extraction tasks, which rely on clearness and proper named-entity rendering. Further, we show that the evaluation metrics incorporated in PLATO differentiate between MT systems’ performance on a text type for which they are presumably optimized and one on which they are not.

What’s in a Name: Current Methods, Applications, and Evaluation in Multilingual Name Search and Matching
Sherri Condon | Keith Miller
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Tutorial Abstracts


Interlingual Annotation of Multilingual Text Corpora
Stephen Helmreich | David Farwell | Bonnie Dorr | Nizar Habash | Lori Levin | Teruko Mitamura | Florence Reeder | Keith Miller | Eduard Hovy | Owen Rambow | Advaith Siddharthan
Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004


Sharing Problems and Solutions for Machine Translation of Spoken and Written Interaction
Sherri Condon | Keith Miller
Proceedings of the ACL-02 Workshop on Speech-to-Speech Translation: Algorithms and Systems

Scaling the ISLE Framework: Use of Existing Corpus Resources for Validation of MT Evaluation Metrics across Languages
Michelle Vanni | Keith Miller
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)


Integrated Feasibility Experiment for Bio-Security: IFE-Bio, A TIDES Demonstration
Lynette Hirschman | Kris Concepcion | Laurie Damianos | David Day | John Delmore | Lisa Ferro | John Griffith | John Henderson | Jeff Kurtz | Inderjeet Mani | Scott Mardis | Tom McEntee | Keith Miller | Beverly Nunam | Jay Ponte | Florence Reeder | Ben Wellner | George Wilson | Alex Yeh
Proceedings of the First International Conference on Human Language Technology Research


An Architecture for Dialogue Management, Context Tracking, and Pragmatic Adaptation in Spoken Dialogue Systems
Susann LuperFoy | Dan Loehr | David Duff | Keith Miller | Florence Reeder | Lisa Harper
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

A Multi-Neuro Tagger Using Variable Lengths of Contexts
Susann LuperFoy | Dan Loehr | David Duff | Keith Miller | Florence Reeder | Lisa Harper | Qing Ma | Hitoshi Isahara
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2