From the Physical to the Digital Library

Creation and Application of a HTR Model for the Transcription of Giono’s Annotated Books

  • Virginia Melotto, Università degli Studi di Torino


As part of the author’s archive, Giono’s physical library is a source of extratextual information that should be taken into account when interpreting his late novels, particularly with regard to his sociopolitical views. The volumes containing reading marks constitute an intertextual context for the novels, and the presence of numerous political works contrasts with the authorial image of disengagement that dates from the end of the Second World War. Within the digitization pipeline, which aims at the web publication of a digital edition of a selected number of political texts, the present article focusses on the extraction of machine-readable text from the image files, describing how the transcription is carried out automatically through the creation and application of an HTR model with Transkribus. We describe the ground-truth material supplied, the parameters set, the training of the model, and the results of multiple training runs, and give examples of the transcriptions. The resulting model is ready to be used for future transcriptions, enabling the efficient digitization of a large number of volumes from the author’s library as well as other documents from his archive.

How to Cite
Melotto, V. (2023). From the Physical to the Digital Library: Creation and Application of a HTR Model for the Transcription of Giono’s Annotated Books. RiCOGNIZIONI. Rivista di Lingue e Letterature Straniere e Culture Moderne, 10(19).