Cooperative Governance Preferences for Open Knowledge Sharing in the Age of Generative AI

Many bodies are considering how knowledge can be shared openly while asserting rules about how third parties may use it to train generative AI models. Approaches include denying permission to use open knowledge resources for AI training, or requiring trained models themselves to be made open access [1]. Academic research has a strong ethos, backed by existing policy, of making outputs openly available. Generative AI products increasingly use open access research papers and data to support functions such as literature review, ideation, experiment design and data analysis. However, there are concerns that AI mediation of important research steps may introduce inaccuracies and errors, bias towards certain concepts over others, and inaccurate or missing attributions [2]. Academic research is therefore an example of a field where the use of open access resources for training is welcome, but where specific quality assurance requirements are needed to avoid undermining existing systems of knowledge generation and consumption. This project will explore options for defining machine-readable rules that could accompany collections of published research work, and for tracking whether those rules are adhered to in the training and use of GenAI. For example, this could be applied to an institutional repository, such as TCD TARA, with enforcement of the rules integrated into a version of an academic support AI operated by TCD. This also offers the opportunity to operate the resource and accompanying AI as a data space, using the technical supports for data sharing spaces [3] and the legal provisions of the Data Governance Act [4] to enable democratic oversight via a data cooperative [5], e.g. so that those contributing to the published knowledge can collectively change the rules over time.
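
To make the idea of machine-readable rules more concrete, the following is a minimal Python sketch of a rule attached to a collection, a check of a usage request against it, and an audit log that records each decision. It is an illustration under stated assumptions, not the project's design: the class names, the collection identifier "tara-theses" and the use and obligation labels are all hypothetical, and a real implementation would more likely express rules in a semantic web policy language (the W3C ODRL vocabulary is one candidate) than in ad hoc Python classes.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class UsageRule:
    """Hypothetical rule accompanying one collection of published work."""
    collection: str
    permitted_uses: FrozenSet[str]   # e.g. {"ai-training"}
    obligations: FrozenSet[str]      # conditions the user must commit to


@dataclass(frozen=True)
class UsageRequest:
    """Hypothetical request by a third party to use a collection."""
    requester: str
    collection: str
    use: str
    commitments: FrozenSet[str]


@dataclass(frozen=True)
class Decision:
    request: UsageRequest
    allowed: bool
    missing: FrozenSet[str]          # obligations the requester did not commit to


def evaluate(rule: UsageRule, request: UsageRequest,
             audit_log: List[Decision]) -> Decision:
    """Check a request against a rule and record the outcome for later review."""
    in_scope = (request.collection == rule.collection
                and request.use in rule.permitted_uses)
    # Obligations are only compared when the use itself is permitted.
    missing = rule.obligations - request.commitments if in_scope else frozenset()
    decision = Decision(request, in_scope and not missing, missing)
    audit_log.append(decision)
    return decision


# Example: training is permitted, provided sources are attributed.
rule = UsageRule(
    collection="tara-theses",
    permitted_uses=frozenset({"ai-training"}),
    obligations=frozenset({"attribute-sources"}),
)
log: List[Decision] = []
request = UsageRequest("example-lab", "tara-theses", "ai-training", frozenset())
decision = evaluate(rule, request, log)
# decision.allowed is False here because "attribute-sources" was not committed to.
```

Keeping every decision in the audit log, including refusals, is what would let a data cooperative review how the rules are working in practice before voting to change them.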

What you will learn: This project will involve developing web integration skills and working with semantic web and access control languages, as well as gaining an understanding of the changing legal framework for data sharing and trusted data intermediaries outlined in the EU’s new Data Governance Act, and of IP governance for GenAI.

[1] https://creativecommons.org/ai-and-the-commons/cc-signals/ 

[2] https://european-research-area.ec.europa.eu/news/living-guidelines-responsible-use-generative-ai-research-published 

[3] https://dl.acm.org/doi/10.1007/978-3-030-62466-8_12   

[4] https://publications.jrc.ec.europa.eu/repository/handle/JRC133988

[5] https://data.coop/en/