25/26 PROJECT #5: Generative AI assisted synchronisation of knowledge graphs with evolving data source structures

Many knowledge graphs are not constructed from scratch but are built through the ongoing uplift of data from existing data sources held in a variety of representations (relational data, JSON, CSV, XML, etc.). The community has developed specifications that let engineers who construct and maintain knowledge graphs specify flexibly which data items are selected for uplift and how they are to be represented in the knowledge graph. The most popular specification (and associated technology) in use is RML (https://rml.io/specs/rml/).
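
To make the idea of uplift concrete, the following is a minimal Python sketch of applying a mapping to a CSV source to produce triples. The dictionary layout, the `uplift` function and the `http://example.org/` identifiers are assumptions made purely for illustration; this is a simplified stand-in and not RML syntax or an RML processor.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical, simplified stand-in for a mapping (NOT real RML syntax):
# it says which columns are selected and how they appear in the graph.
MAPPING = {
    "subject_template": "http://example.org/person/{id}",
    "class": "http://example.org/vocab/Person",
    "predicate_object": [
        ("http://example.org/vocab/name", "name"),
        ("http://example.org/vocab/email", "email"),
    ],
}


def uplift(csv_text, mapping):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Build the subject IRI from the row using the template.
        subject = mapping["subject_template"].format(**row)
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for predicate, column in mapping["predicate_object"]:
            value = row.get(column)
            if value:  # skip missing or empty cells
                triples.append((subject, predicate, value))
    return triples


SAMPLE = "id,name,email\n1,Ada,ada@example.org\n2,Bob,\n"

for triple in uplift(SAMPLE, MAPPING):
    print(triple)
```

Only the columns named in the mapping reach the graph, which is the selection and representation decision that a mapping specification such as RML captures declaratively.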

The problem is not only deciding which data items should be selected for uplift, but also how they should be transformed to fit the knowledge graph schema and representation; in other words, designing the mapping between the data source schemas and the knowledge graph schema. This is especially challenging in scenarios where the data source schemas or the knowledge graph schema evolve over time.
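
As an illustration of why schema evolution matters, the sketch below compares the fields a mapping refers to against the fields currently present in a data source. The mapping layout, the function names and the example field names are hypothetical and chosen only for this example; a real project would extract the references from actual RML documents.

```python
from string import Formatter

# Hypothetical, simplified mapping description (NOT real RML syntax).
MAPPING = {
    "subject_template": "http://example.org/person/{id}",
    "predicate_object": [
        ("http://example.org/vocab/name", "name"),
        ("http://example.org/vocab/email", "email"),
    ],
}


def referenced_fields(mapping):
    """Collect every source field the mapping depends on."""
    # Formatter().parse yields (literal, field_name, spec, conversion).
    fields = {
        name
        for _, name, _, _ in Formatter().parse(mapping["subject_template"])
        if name
    }
    fields.update(column for _, column in mapping["predicate_object"])
    return fields


def find_drift(mapping, source_fields):
    """Report mapping references that broke and source fields left unmapped."""
    used = referenced_fields(mapping)
    available = set(source_fields)
    return {
        "broken": sorted(used - available),
        "unmapped": sorted(available - used),
    }


# Assumed evolution: "name" was renamed to "full_name" and "phone" was added.
print(find_drift(MAPPING, ["id", "full_name", "email", "phone"]))
```

A report of this kind shows where a mapping has fallen out of step with its source; deciding how to repair it (for instance, recognising that `full_name` replaces `name`) is the design judgement that still has to be made.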

It is therefore worth exploring how a generative AI approach can assist people in creating RML mappings, and this is the focus of this project.

This project is suitable as a Final Year Project or MSc Dissertation project, with the scope and ambition of the challenge tailored accordingly.

PREREQUISITES: None per se, but experience in data management and an interest in the use of generative AI to assist users engaged in technical development would definitely be helpful.