Ground-truth on Zenodo

For the evaluation of this year’s competition, we have established the training and test data sets and published the full ground truth for the validation and test sets. The complete dataset (validation + test) is publicly available at the following address:

The training set is described in S. Fiel, F. Kleber, M. Diem, V. Christlein, G. Louloudis, S. Nikos, and B. Gatos, “ICDAR2017 Competition on Historical Document Writer Identification (Historical-WI),” in 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 01, Nov 2017, pp. 1377–1382.

The validation set comprises 1,200 images and the test set 20,000 images.

The provenance and distribution of the test set are shown in the following chart.

Test data set online

The test data set contains 20,000 images, with a varying number of samples per writer (1 to 5). The corpus can be downloaded at the following link:

Update (23/04/2019): the same corpus is now available in higher-resolution images [25 GB]

The new deadline for submitting results is 2 May 2019.

For further info, go to Timeline and Dataset.

Image « 0.jpg »: first of the 20,000 images in the test data set

Additional validation set

Researchers and teams wishing to prepare for the ICDAR-2019-HDRC-IR competition can now validate their methods on an additional corpus. It encompasses handwritten letters as well as book scripts from the Middle Ages and the 16th century.

The corpus can be downloaded at the following link:

It contains 300 writers contributing 1 page, 100 writers contributing 3 pages, and 120 writers contributing 5 pages, resulting in 1,200 images from 520 writers.
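
The totals above can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic: the dictionary layout and the function name `corpus_totals` are illustrative choices and are not part of the competition material.

```python
# Number of writers per contribution size, as stated in the text above.
# Key: pages contributed by each writer; value: number of such writers.
pages_per_writer = {1: 300, 3: 100, 5: 120}


def corpus_totals(groups):
    """Return (total_writers, total_images) for a {pages: writers} mapping."""
    total_writers = sum(groups.values())
    total_images = sum(pages * writers for pages, writers in groups.items())
    return total_writers, total_images


writers, images = corpus_totals(pages_per_writer)
print(writers, images)  # 520 1200
```

The same function can be reused to sanity-check a downloaded copy of the corpus once the per-writer page counts have been extracted from it.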