AISoLA 2025

Bridging the Gap Between AI and Reality • Rhodes, Greece

Talk

An Interpretable Technique for Handwritten Character Recognition in Historical Documents

Time: Wednesday, 5.11

Room: Room C

Authors: Stefano Ferilli, Eleonora Bernasconi, Domenico Redavid

Abstract: Although considered a solved problem for contemporary printed documents, Optical Character Recognition (OCR) can still be challenging when applied to historical documents, both in learning the models and in classifying new symbols. In addition, for research purposes, features of the character shape may be relevant to better understanding the document and to philological analysis. Finally, in ancient documents the number of examples available for learning character models may be extremely low, and even insufficient. This paper describes a novel approach to OCR in historical documents. It is a prototype-based approach requiring very few examples (often just one) of each character. It is interpretable, meaning that it can explicitly describe why a character was recognized and which features of the character image were most relevant to the classification. It is incremental, meaning that the set of recognized characters can be expanded on the fly while processing a document, without requiring a global recomputation from scratch. A discussion and a demonstration of the proposed approach are also provided.
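
The abstract does not describe the authors' technique in detail, so the following is only a generic illustration of the three properties it names, not the method presented in the talk. It is a minimal sketch that assumes each character image has already been reduced to a numeric feature vector and uses a plain nearest-prototype rule with squared Euclidean distance. The class name `PrototypeClassifier`, its methods, and the toy feature values are all hypothetical.

```python
import numpy as np


class PrototypeClassifier:
    """Toy nearest-prototype classifier (illustrative assumption only)."""

    def __init__(self):
        self.labels = []      # one label per stored prototype
        self.prototypes = []  # one feature vector per stored prototype

    def add_prototype(self, label, features):
        # Incremental: a new character is added by storing one example;
        # nothing already stored is recomputed.
        self.labels.append(label)
        self.prototypes.append(np.asarray(features, dtype=float))

    def classify(self, features):
        if not self.prototypes:
            raise ValueError("no prototypes have been added yet")
        x = np.asarray(features, dtype=float)
        # Per-feature squared differences to every prototype.
        sq_diffs = [(x - p) ** 2 for p in self.prototypes]
        distances = [d.sum() for d in sq_diffs]
        best = int(np.argmin(distances))
        # Interpretable: return the winning label together with the
        # per-feature differences, showing which features matched closely
        # (small values) and which did not (large values).
        return self.labels[best], sq_diffs[best]


# Hypothetical usage with made-up three-dimensional feature vectors.
clf = PrototypeClassifier()
clf.add_prototype("a", [1.0, 0.0, 0.5])  # a single example per character
clf.add_prototype("b", [0.0, 1.0, 0.5])
label, per_feature = clf.classify([0.9, 0.1, 0.5])

# A new character met mid-document is added without retraining.
clf.add_prototype("c", [0.5, 0.5, 1.0])
label_after, _ = clf.classify([0.5, 0.6, 0.9])
```

In this sketch, one stored example per character is enough to classify, the returned per-feature differences serve as a simple explanation of the decision, and adding `"c"` leaves the existing prototypes untouched.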