ICFHR2016 Competition on the Classification of Medieval Handwritings in Latin Script

The ICFHR2016 Competition on the Classification of Medieval Handwritings in Latin Script is a script classification competition held as part of ICFHR 2016 (http://www.nlpr.ia.ac.cn/icfhr2016/competitions.htm).
The original announcement appeared at the following URL: https://oriflamms.hypotheses.org/1388.

Task and context

1 Tasks under evaluation

10-Semihybrida (c) IRHT-CIPL

The task to be evaluated in the present competition is the classification of 1000 images of Latin scripts, taken from handwritten books dated from 500 C.E. to 1600 C.E.

The organizers provided a training data-set consisting of 2000 images of well-defined script types. The results are evaluated on images that are not included in the training data-set: a set of 1000 images for task 1 and/or a set of 2000 images of mixed script types for task 2.

The complete data-set is hereafter named CLaMM: Classification of Latin Medieval Manuscripts.

There are 12 pre-defined classes in CLaMM according to the script style.

We propose two possible tasks: task 1 named “Crisp Classification” and task 2 named “Fuzzy Classification”.

The two tasks are independent, and participants must state whether they want to perform task 1, task 2, or both.

  • Task 1 is divided into two steps:
    • The participants will have to provide a “distance matrix” between pairs of images
    • The participants will have to assign a single label to each image
  • Task 2 is also divided into two steps:
    • The participants will have to provide a “distance matrix” between pairs of images
    • The participants will have to assign a multi-weighted labeling to each image
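
The two steps of task 1 can be sketched as follows. This is only an illustration under stated assumptions: each image is assumed to have already been reduced to a numeric feature vector (feature extraction is not shown), the Euclidean distance and the nearest-neighbour rule are arbitrary choices rather than anything prescribed by the competition, and the function names and label strings are hypothetical. The actual output format is the one described in section 2.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(features):
    """Step 1: symmetric matrix of pairwise distances between images."""
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = euclidean(features[i], features[j])
    return matrix


def crisp_labels(test_features, train_features, train_labels):
    """Step 2: give each test image the label of its nearest training image."""
    labels = []
    for sample in test_features:
        nearest = min(
            range(len(train_features)),
            key=lambda k: euclidean(sample, train_features[k]),
        )
        labels.append(train_labels[nearest])
    return labels


if __name__ == "__main__":
    # Toy 2-D feature vectors; real descriptors would be much larger.
    train = [[0.0, 0.0], [3.0, 4.0]]
    train_labels = ["Cursiva", "Hybrida"]  # illustrative label names only
    test = [[0.0, 1.0], [3.0, 3.0]]
    for row in distance_matrix(test):
        print(row)
    print(crisp_labels(test, train, train_labels))
```

Any other distance or classifier could be substituted; the point is only that step 1 yields one value per pair of images and step 2 yields exactly one label per image.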

Participants are expected to provide executable files capable of producing the results of steps 1 and 2 of the respective tasks, in the format described in section 2.
In task 1, the training data-set and the test data-set encompass well-defined script types, in order to make the evaluation possible. In task 2, the test data-set also encompasses mixed script types, which illustrate the evolution of Latin scripts.
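
For task 2, a “multi-weighted labeling” can be pictured as one weight per class for each image, so that a mixed script type receives weight on more than one class. The sketch below is a hypothetical illustration, not the organizers' method: it assumes precomputed feature vectors, uses the inverse distance to the nearest training example of each class as the weight, and normalizes the weights to sum to 1. The function names, the `eps` parameter, and the label strings are all assumptions; the required output format is the one described in section 2.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_weights(sample, train_features, train_labels, eps=1e-9):
    """Return a dict mapping each class label to a weight in [0, 1].

    The weight of a class is proportional to the inverse of the distance
    between the sample and the closest training example of that class.
    `eps` avoids a division by zero when the sample equals a training example.
    """
    nearest = {}
    for features, label in zip(train_features, train_labels):
        d = euclidean(sample, features)
        if label not in nearest or d < nearest[label]:
            nearest[label] = d
    inverse = {label: 1.0 / (d + eps) for label, d in nearest.items()}
    total = sum(inverse.values())
    return {label: value / total for label, value in inverse.items()}


if __name__ == "__main__":
    train = [[0.0, 0.0], [2.0, 0.0]]
    train_labels = ["Cursiva", "Hybrida"]  # illustrative label names only
    # A sample lying between the two classes gets weight on both of them.
    print(fuzzy_weights([1.0, 0.0], train, train_labels))
```

A crisp label can always be recovered from such weights by taking the class with the largest weight, which is one way to see how the two tasks relate.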

Related topics and previous work

The present competition on the Classification of Medieval Handwritings in Latin Script is related to, but differs from, the following topics:

  • Segmentation and text detection on an image;
  • Binarization;
  • Image Feature Extraction;
  • Sorting out different scripts (Latin / Arabic / Greek / Hebrew, etc.);
  • Performing scribal identification within a homogeneous corpus or within a particular manuscript.

The last topic is the closest and has been dealt with by numerous competitions and publications [2]–[4].

As for the classification of medieval handwritings in Latin script specifically, the first attempt at automating it was made by the Graphem research project (Grapheme based Retrieval and Analysis for PalaeograpHic Expertise of medieval Manuscripts), funded by the French National Research Agency (ANR-07-MDCO-006, 2007-2011). The results are published in [5], [6].

Further research has been conducted on a theoretical level by one of the organizers and several teams in Computer Science [7]–[15]. Nevertheless, none of the teams had access to the labelled data-set, and that data-set has not been made available anywhere.


Works Cited

[1] R. Niels and L. Vuurpijl, “Generating Copybooks from Consistent Handwriting Styles,” in Ninth International Conference on Document Analysis and Recognition, 2007. ICDAR 2007, 2007, vol. 2, pp. 1009–1013.

[2] L. R. B. Schomaker, K. Franke, and M. L. Bulacu, “Using codebooks of fragmented connected-component contours in forensic and historic writer identification,” Pattern Recognition Letters, vol. 28, no. 6, pp. 719–727, 2007.

[3] A. A. Brink, J. Smit, M. L. Bulacu, and L. R. B. Schomaker, “Writer identification using directional ink-trace width measurements,” Pattern Recognition, vol. 45, no. 1, pp. 162–171, 2012.

[4] S. He, M. Wiering, and L. R. B. Schomaker, “Junction detection in handwritten documents and its application to writer identification,” Pattern Recognition, vol. 48, pp. 4036–4048, 2015.

[5] D. Muzerelle and M. Gurrado, Eds., Analyse d’image et paléographie systématique : travaux du programme “Graphem” : communications présentées au colloque international “Paléographie fondamentale, paléographie expérimentale : l’écriture entre histoire et science” (Institut de recherche et d’histoire des textes (CNRS), Paris, 14-15 avril 2011). Paris: Association Gazette du livre médiéval, 2011.

[6] D. Stutzmann and M. Gurrado, “Mesure et histoire des écritures médiévales,” in Mesure et histoire médiévale, Actes du XLIIIe Congrès de la SHMESP, Paris: Publications de la Sorbonne, 2013, pp. 153–166.

[7] N. Vincent, A. Seropian, and G. Stamon, “Synthesis for handwriting analysis,” Pattern Recognition Letters, vol. 26, no. 3, pp. 267–275, 2005.

[8] G. Joutel, V. Eglin, and H. Emptoz, “Generic scale-space process for handwriting documents analysis,” in 19th International Conference on Pattern Recognition, 2008. ICPR 2008, 2008, pp. 1–4.

[9] I. Siddiqi, F. Cloppet, and N. Vincent, “Contour Based Features for the Classification of Ancient Manuscripts,” presented at the 14th Conference of the International Graphonomics Society (IGS), Dijon, 2009.

[10] G. Joutel, V. Eglin, and H. Emptoz, “Generic Scale-Space Architecture for Handwriting Documents Analysis, chapter 15,” in Pattern Recognition Recent Advances, A. Herout, Ed. InTech, 2010, pp. 293–312.

[11] F. Cloppet, H. Daher, V. Églin, H. Emptoz, M. Exbrayat, G. Joutel, F. Lebourgeois, L. Martin, I. Moalla, I. Siddiqi, and N. Vincent, “New Tools for Exploring, Analysing and Categorising Medieval Scripts,” Digital Medievalist, no. 7, 2011.

[12] H. Daher, V. Églin, S. Brès, and N. Vincent, “Étude de la dynamique des écritures médiévales. Analyse et classification des formes écrites,” Gazette du livre médiéval, vol. 56–57, pp. 21–41, 2011.

[13] I. Siddiqi, F. Cloppet, and N. Vincent, “Writing property descriptors. A proposal for typological groupings,” Gazette du livre médiéval, vol. 56–57, pp. 42–57, 2011.

[14] V. Eglin, D. Gaceb, H. Daher, S. Bres, and N. Vincent, “Outils d’analyse de la dynamique des écritures médiévales pour l’aide à l’expertise paléographique,” Document Numérique, vol. 41, no. 1, pp. 81–104, 2011.

[15] D. Stutzmann, “Clustering of medieval scripts through computer image analysis: towards an evaluation protocol,” Digital Medievalist, vol. 10, 2015.