3 Outputs and Evaluation

3.1 Expected outputs for the Competition

For the competition, delivering an executable file is mandatory.

Executable file

Participants will provide:

  • an executable file
  • a description of the required environment and resources, and the expected processing time for 1000 images.

Input/Output

The executable file should:

  • read the images from a folder
  • produce:
    • a “symmetrical distance matrix” of the images present in the folder (for Task 1 as well as for Task 2). The sum of all entries of this matrix must be equal to 1.0.
    • a CSV file with 2 columns, “FILENAME, SCRIPT_TYPE”, for Task 1 (SCRIPT_TYPE is an integer between 1 and 12 indicating the class to which the image belongs).
    • a CSV file with 13 columns, “FILENAME, SCRIPT_TYPE1, …, SCRIPT_TYPE12”, for Task 2 (SCRIPT_TYPE1, …, SCRIPT_TYPE12 are real values, and the values in each row sum to 1.0). This information will be referred to as the Belonging Matrix in the following sections. A sketch of these output formats is given below.
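As an illustration only, the sketch below shows one way a submission could write these three files; the variable names (filenames, distances, memberships) and the output file names are hypothetical and not prescribed by the competition.

```python
import csv
import numpy as np

def write_outputs(filenames, distances, memberships, out_dir="."):
    # Symmetrize and normalize the distance matrix so that the sum of
    # all its entries equals 1.0, as the specification requires.
    d = np.asarray(distances, dtype=float)
    d = (d + d.T) / 2.0
    d /= d.sum()
    np.savetxt(f"{out_dir}/distance_matrix.csv", d, delimiter=",")

    # Task 1: one hard label per image, an integer between 1 and 12.
    with open(f"{out_dir}/task1.csv", "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["FILENAME", "SCRIPT_TYPE"])
        for name, row in zip(filenames, memberships):
            w.writerow([name, int(np.argmax(row)) + 1])

    # Task 2 (Belonging Matrix): 12 real values per image, each row
    # normalized so that its values sum to 1.0.
    with open(f"{out_dir}/task2.csv", "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["FILENAME"] + [f"SCRIPT_TYPE{k}" for k in range(1, 13)])
        for name, row in zip(filenames, memberships):
            row = np.asarray(row, dtype=float) / np.sum(row)
            w.writerow([name] + [f"{v:.6f}" for v in row])
```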

Note: participants are allowed to submit several independent proposals.

3.2 Evaluation

Based on the test dataset, the evaluation will be performed as follows:

  • Accuracy per script type
  • Global accuracy for fuzzy results
  • Normalized distance matrix analysis
  • Processing time

For each task, two rankings will be produced: the first based on the average global accuracy, the second on the symmetrical distance matrix.

Task 1

The “accuracy per script type” is computed against the ground truth, which provides one label for each script image in the evaluation dataset. The ranking will be based on the average accuracy, as sketched below.
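As an illustration, here is a minimal sketch of this criterion, assuming y_true and y_pred are lists of integer labels between 1 and 12 (both names are hypothetical):

```python
import numpy as np

def per_script_accuracy(y_true, y_pred, n_classes=12):
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = {}
    for c in range(1, n_classes + 1):
        mask = y_true == c
        if mask.any():
            per_class[c] = float((y_pred[mask] == c).mean())
    # The ranking criterion: the average of the per-class accuracies.
    return per_class, float(np.mean(list(per_class.values())))
```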

As far as the distance matrix is concerned, the evaluation will be based on the average intra-class distance, which yields the second ranking.
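A minimal sketch of this criterion follows, assuming dist is the normalized symmetrical distance matrix and y_true the ground-truth labels; the exact averaging used by the organizers may differ in detail.

```python
import numpy as np

def average_intraclass_distance(dist, y_true):
    dist = np.asarray(dist, dtype=float)
    y_true = np.asarray(y_true)
    per_image = []
    for i, c in enumerate(y_true):
        same = y_true == c
        same[i] = False  # exclude the image's zero distance to itself
        if same.any():
            per_image.append(dist[i, same].mean())
    # Lower average intra-class distance ranks better.
    return float(np.mean(per_image))
```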

Task 2

The “global accuracy for fuzzy results” will be evaluated as follows:

  • The ground truth indicates one or two labels for each script image. From the Belonging Matrix, only the two highest membership degrees define SCRIPT_TYPE1 and SCRIPT_TYPE2.
  • Scores will be attributed as follows (see the sketch after this list):
    • 4 points if both SCRIPT_TYPE1 and SCRIPT_TYPE2 match the labels given to the image in the ground truth.
    • 2 points if SCRIPT_TYPE1 matches one of the labels given to the image in the ground truth, but SCRIPT_TYPE2 does not.
    • 1 point if SCRIPT_TYPE2 matches one of the labels given to the image in the ground truth, but SCRIPT_TYPE1 does not.
    • −2 points if neither SCRIPT_TYPE1 nor SCRIPT_TYPE2 matches any of the labels given to the image in the ground truth.
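The scoring rules above translate directly into code. Below is a minimal sketch, where memberships is one row of the Belonging Matrix and truth_labels is the set of one or two ground-truth labels of the image (both names are hypothetical):

```python
def fuzzy_score(memberships, truth_labels):
    # SCRIPT_TYPE1 / SCRIPT_TYPE2 are the classes with the two highest
    # membership degrees in this image's row of the Belonging Matrix.
    order = sorted(range(len(memberships)),
                   key=lambda k: memberships[k], reverse=True)
    type1, type2 = order[0] + 1, order[1] + 1
    hit1, hit2 = type1 in truth_labels, type2 in truth_labels
    if hit1 and hit2:
        return 4
    if hit1:
        return 2
    if hit2:
        return 1
    return -2
```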

The ranking is based on the average of these scores over the test set.

As far as the distance matrix is concerned, for each image we consider the one or two ground-truth classes with the highest membership degrees. The sum of the distances to the images of the same ground-truth classes is then computed and averaged over the whole test base. The ranking will be done with respect to this global value, as sketched below.
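A minimal sketch of this criterion follows, assuming dist is the normalized symmetrical distance matrix and truth_labels[i] is the set of one or two ground-truth classes of image i; the organizers' exact implementation may differ.

```python
import numpy as np

def task2_distance_criterion(dist, truth_labels):
    dist = np.asarray(dist, dtype=float)
    n = len(truth_labels)
    per_image = []
    for i in range(n):
        # Images sharing at least one ground-truth class with image i.
        same = [j for j in range(n)
                if j != i and truth_labels[j] & truth_labels[i]]
        if same:
            per_image.append(dist[i, same].sum())
    # Averaged over the whole test base; lower values rank better.
    return float(np.mean(per_image))
```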