[ALLOCATED] 24/25 Project: Using advanced NLP techniques to support annotation of transcribed historical documents

With tools such as Transkribus (https://www.transkribus.org/) and eScriptorium (https://www.sofer.info/), AI-based image processing techniques are unblocking what has been a hugely expensive and time-intensive process: turning historical documents into machine-readable transcriptions at scale. The next problem to tackle is how to support historians in interrogating these transcriptions in a way that serves their research and analysis. This typically requires tools that automatically tag texts, focusing on the particular terms that underpin a given enquiry. For example, historians in virtualtreasury.ie are interested in tagging people, places and organisations, whereas researchers in voicesproject.ie are interested in females in particular, violent events, property-related events, etc. This project will explore whether a tool can be put in place that lets historians themselves configure the NLP techniques (including LLM-based techniques if required) that drive such tagging within transcribed historical documents.
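To make the idea of historian-configurable tagging concrete, the following is a minimal sketch only, not a proposed design for the project. It assumes the simplest possible configuration: a mapping from a tag label to a list of terms, matched as whole words. The names `Tag`, `tag_text` and `EXAMPLE_CONFIG`, the labels, the term lists and the sample sentence are all hypothetical and chosen purely for illustration; a real tool would likely replace the term matching with trained NLP models or LLM prompts.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """One tagged span in a transcription (hypothetical structure)."""
    label: str
    text: str
    start: int
    end: int


def tag_text(text, config):
    """Return a Tag for every configured term found in `text`.

    `config` maps a tag label to a list of terms. Matching is
    case-insensitive and restricted to whole words.
    """
    tags = []
    for label, terms in config.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                tags.append(Tag(label, match.group(0), match.start(), match.end()))
    # Sort by position so tags appear in reading order.
    return sorted(tags, key=lambda t: (t.start, t.end))


# Hypothetical configuration a historian might supply for one enquiry.
EXAMPLE_CONFIG = {
    "PLACE": ["Dublin", "Kilkenny"],
    "ORGANISATION": ["Exchequer"],
}

# Invented sample sentence, not taken from any real transcription.
sample = "The Exchequer at Dublin received the rolls from Kilkenny."
tags = tag_text(sample, EXAMPLE_CONFIG)
for tag in tags:
    print(tag.label, tag.text, tag.start, tag.end)
```

Keeping the configuration as plain data, separate from the matching code, is one way a web-based interface could let historians change what is tagged without touching the underlying NLP code.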

This project provides a rich opportunity to research current state-of-the-art approaches to both user interface development and NLP/LLM-based techniques.

KEYWORDS: User Interface, NLP/LLM-based techniques

PREREQUISITES: No prerequisite per se, but it is definitely helpful to have interest/experience in NLP/LLM-based techniques and web-based user interface design/development.