C. Anton Rytting


Arabic Data Science Toolkit: An API for Arabic Language Feature Extraction
Paul Rodrigues | Valerie Novak | C. Anton Rytting | Julie Yelle | Jennifer Boutz
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)


DECCA Repurposed: Detecting transcription inconsistencies without an orthographic standard
C. Anton Rytting | Julie Yelle
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages


ArCADE: An Arabic Corpus of Auditory Dictation Errors
C. Anton Rytting | Paul Rodrigues | Tim Buckwalter | Valerie Novak | Aric Bills | Noah H. Silbert | Mohini Madgavkar
Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications


Typing Race Games as a Method to Create Spelling Error Corpora
Paul Rodrigues | C. Anton Rytting
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a method to elicit spelling error corpora using an online typing race game. After being tested for their native language, English-native participants were instructed to retype stimuli as quickly and as accurately as they could. The participants were informed that the system was keeping a score based on accuracy and speed, and that a high score would result in a position on a public scoreboard. Words were presented on the screen one at a time from a queue, and the queue was advanced by pressing the ENTER key following the stimulus. Responses were recorded and compared to the original stimuli. Responses that differed from the stimuli were considered typographical or spelling errors and were added to an error corpus. Collecting a corpus using a game offers several unique benefits. 1) A game attracts engaged participants quickly. 2) The web-based delivery reduces the cost and decreases the time and effort of collecting the corpus. 3) Participants have fun. Spelling error corpora have been difficult and expensive to obtain for many languages, and this research was performed to fill this gap. In order to evaluate the methodology, we compare our game data against three existing spelling corpora for English.
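
The collection step described in this abstract (record each response, compare it to its stimulus, keep the mismatches) can be illustrated with a minimal sketch. The `Trial` type, the `collect_errors` function, and the sample words are hypothetical names and data chosen for illustration; they are not taken from the paper, and the game's scoring and queueing are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One stimulus shown to a participant and what they typed (hypothetical type)."""
    stimulus: str
    response: str


def collect_errors(trials):
    """Return (stimulus, response) pairs whose response differs from the stimulus."""
    return [(t.stimulus, t.response) for t in trials if t.response != t.stimulus]


# Toy data, invented for this example only.
trials = [
    Trial("receive", "recieve"),
    Trial("typing", "typing"),
    Trial("corpus", "corpsu"),
]
error_corpus = collect_errors(trials)
```

Exact string comparison is the simplest reading of "differed from the stimuli"; a real pipeline might also need to normalize case or whitespace, which this sketch does not assume.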


Error Correction for Arabic Dictionary Lookup
C. Anton Rytting | Paul Rodrigues | Tim Buckwalter | David Zajic | Bridget Hirsch | Jeff Carnes | Nathanael Lynn | Sarah Wayland | Chris Taylor | Jason White | Charles Blake III | Evelyn Browne | Corey Miller | Tristan Purvis
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We describe a new Arabic spelling correction system which is intended for use with electronic dictionary search by learners of Arabic. Unlike other spelling correction systems, this system does not depend on a corpus of attested student errors but on student- and teacher-generated ratings of confusable pairs of phonemes or letters. Separate error modules for keyboard mistypings, phonetic confusions, and dialectal confusions are combined to create a weighted finite-state transducer that calculates the likelihood that an input string could correspond to each citation form in a dictionary of Iraqi Arabic. Results are ranked by the estimated likelihood that a citation form could be misheard, mistyped, or mistranscribed for the input given by the user. To evaluate the system, we developed a noisy-channel model trained on students' speech errors and use it to perturb citation forms from a dictionary. We compare our system to a baseline based on Levenshtein distance and find that, when evaluated on single-error queries, our system performs 28% better than the baseline (overall MRR) and is twice as good at returning the correct dictionary form as the top-ranked result. We believe this to be the first spelling correction system designed for a spoken, colloquial dialect of Arabic.
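
As a rough illustration of the evaluation setup mentioned in this abstract, the sketch below ranks dictionary entries by unweighted Levenshtein distance and scores the ranking with mean reciprocal rank (MRR). It covers only the kind of baseline and metric the abstract names, not the authors' weighted finite-state transducer. The function names, the tie-breaking rule, and the romanized toy lexicon are assumptions made for this example.

```python
def levenshtein(a, b):
    """Unweighted edit distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def rank_candidates(query, lexicon):
    """Sort citation forms by distance to the query; ties broken alphabetically (an assumption)."""
    return sorted(lexicon, key=lambda w: (levenshtein(query, w), w))


def mean_reciprocal_rank(queries, lexicon):
    """Average of 1/rank of the intended form over (query, target) pairs."""
    total = 0.0
    for query, target in queries:
        ranked = rank_candidates(query, lexicon)
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(queries)


# Toy romanized lexicon and single-error queries, invented for illustration.
lexicon = ["kitab", "katib", "maktab", "kalb"]
queries = [("kitap", "kitab"), ("qalb", "kalb")]
mrr = mean_reciprocal_rank(queries, lexicon)
```

A weighted system of the sort the abstract describes would replace the uniform edit costs with confusion-specific weights; the MRR computation would stay the same.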


A Cost-Benefit Analysis of Hybrid Phone-Manner Representations for ASR
Eric Fosler-Lussier | C. Anton Rytting
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing


Greek Word Segmentation Using Minimal Information
C. Anton Rytting
Proceedings of the Student Research Workshop at HLT-NAACL 2004

Segment Predictability as a Cue in Word Segmentation: Application to Modern Greek
C. Anton Rytting
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology