Daniel Whyatt

Also published as: Dan Whyatt


A Novel Approach to Part Name Discovery in Noisy Text
Nobal Bikram Niraula | Daniel Whyatt | Anne Kao
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

As a specialized example of information extraction, part name extraction is an area that presents unique challenges. Part names are typically multi-word terms longer than two words. There is little consistency in how terms are described in noisy free text, with variations spawned by typos, ad hoc abbreviations, acronyms, and incomplete names. This makes search and analysis of parts in these data extremely challenging. In this paper, we present our algorithm, PANDA (Part Name Discovery Analytics), based on a unique method that exploits statistical, linguistic, and machine learning techniques to discover part names in noisy text such as that in manufacturing quality documentation, supply chain management records, service communication logs, and maintenance reports. Experiments show that PANDA is scalable and significantly outperforms existing techniques.


The IUCL+ System: Word-Level Language Identification via Extended Markov Models
Levi King | Eric Baucom | Timur Gilmanov | Sandra Kübler | Dan Whyatt | Wolfgang Maier | Paul Rodrigues
Proceedings of the First Workshop on Computational Approaches to Code Switching

Parsing German: How Much Morphology Do We Need?
Wolfgang Maier | Sandra Kübler | Daniel Dakota | Daniel Whyatt
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages