Document Analysis
Abstract
The proposal comprises several machine learning projects in the field of historical document analysis. We currently envision the following projects over the duration of this application:
1/ Extended Detection and Recognition of Ancient Egyptian Characters. Based on previous research results, we plan to develop an extended framework for the recognition of ancient Egyptian hieratic characters that provides character candidates from documents, offers few-shot recognition and novelty detection, and enables the integration of human expertise to restructure and correct class definitions.
2/ Text Line Segmentation and HTR on Complex Layouts. In handwritten documents, special structures (e.g., tables, curved and vertical lines of text) and layouts (e.g., multi-column texts) can make it difficult to identify the exact position of lines of text. We aim to develop models that handle a range of complex layouts and perform well in subsequent text recognition tasks, and to train foundation models that provide zero-shot capability on new datasets without the need for further fine-tuning.
3/ Open-set Text Recognition. This task addresses transcribing text from images in which completely unknown characters may appear. The model is expected to flag such occurrences as anomalies to trigger human intervention, and then to let the user add recognition capability for the new character by registering a template with the model, without re-training or fine-tuning. This technology promotes inclusiveness and enables the recognition of historical and modern scripts with extremely low resources. In this work, we also plan to scale this task beyond the scope of text.
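To illustrate the open-set workflow described in project 3, the following is a minimal sketch, not the proposed method. It assumes that each character image has already been mapped to an embedding vector by some feature extractor (not specified in this proposal), and it uses cosine similarity against stored templates with a fixed rejection threshold. The class name `PrototypeRecognizer`, the threshold value, and the toy embeddings are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class PrototypeRecognizer:
    """Hypothetical template-matching recognizer with rejection."""

    def __init__(self, threshold=0.8):
        # Assumed similarity cut-off below which a character counts as unknown.
        self.threshold = threshold
        self.prototypes = {}  # label -> list of template embeddings

    def register(self, label, embedding):
        # Adding a template only stores a vector; no weights are updated.
        self.prototypes.setdefault(label, []).append(list(embedding))

    def recognize(self, embedding):
        # Return (label, score); label is None when the best match is too weak.
        best_label, best_score = None, -1.0
        for label, templates in self.prototypes.items():
            for template in templates:
                score = cosine(embedding, template)
                if score > best_score:
                    best_label, best_score = label, score
        if best_score < self.threshold:
            return None, best_score
        return best_label, best_score


# Toy embeddings standing in for the output of a feature extractor.
rec = PrototypeRecognizer(threshold=0.8)
rec.register("A", [1.0, 0.0, 0.0])
rec.register("B", [0.0, 1.0, 0.0])

unknown = [0.0, 0.1, 1.0]
first, _ = rec.recognize(unknown)   # rejected: no template is similar enough
rec.register("C", unknown)          # human expert supplies a template
second, _ = rec.recognize(unknown)  # now matched against the new template
```

In this sketch, the rejection step plays the role of the anomaly that triggers human intervention, and `register` stands in for template registration without re-training. A real system would need a learned embedding and a calibrated threshold.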