The UJIpenchars Database: a Pen-Based Database of Isolated Handwritten Characters

D. Llorens, F. Prat, A. Marzal, J. M. Vilar, M. J. Castro, J. C. Amengual, S. Barrachina, A. Castellanos, S. España, J. A. Gómez, J. Gorbe, A. Gordo, V. Palazón, G. Peris, R. Ramos-Garijo, F. Zamora


Abstract
The availability of large amounts of data is a fundamental prerequisite for building handwriting recognition systems. Any system needs a test set of labelled samples to measure its performance during development and to guide that development. Moreover, some systems need additional samples, i.e. a training set, to learn the recognition task they will later face. Thus, the acquisition and distribution of standard databases have become an important issue in the handwriting recognition research community. Examples of widely used databases in the online domain are UNIPEN, IRONOFF, and Pendigits. This paper describes the current state of our own database, UJIpenchars, whose first version contains online representations of 1,364 isolated handwritten characters produced by 11 writers and is freely available at the UCI Machine Learning Repository. Moreover, we have recently concluded a second acquisition phase, totalling more than 11,000 samples from 60 writers, which will be made available shortly as UJIpenchars2.
Anthology ID:
L08-1467
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/658_paper.pdf