Reinforcement Learning from LLM Feedback

1 September 2025 Ivana Dusparic

This project will develop techniques for fine-tuning LLMs to act as a source of reward in a reinforcement learning system, whether to replace or complement standard RL rewards, or to align an RL process with human preferences.
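As a rough illustration of the "replace or complement" idea, the sketch below blends an environment reward with a score from an LLM-based scorer. It is only a minimal sketch under stated assumptions, not the project's method: `combined_reward`, `stub_scorer`, the `weight` parameter, and the convention that the scorer returns a value in [0, 1] are all hypothetical, and the stub stands in for a fine-tuned LLM.

```python
from typing import Callable

# Hypothetical interface: maps a text description of a transition
# to a preference score, assumed here to lie in [0, 1].
LLMScorer = Callable[[str], float]


def combined_reward(
    env_reward: float,
    description: str,
    llm_score: LLMScorer,
    weight: float = 0.5,
) -> float:
    """Blend the environment reward with an LLM-derived score.

    weight = 0.0 uses only the environment reward,
    weight = 1.0 replaces it entirely with the LLM score,
    and values in between let the LLM complement the standard reward.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    # Clamp the score so a misbehaving scorer cannot dominate the reward.
    score = min(1.0, max(0.0, llm_score(description)))
    return (1.0 - weight) * env_reward + weight * score


def stub_scorer(description: str) -> float:
    """Placeholder for a fine-tuned LLM; a real scorer would query a model."""
    return 1.0 if "goal" in description else 0.0
```

In this sketch a single `weight` covers both cases in the description: setting it to 1.0 replaces the standard reward, while intermediate values complement it.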

The project is suitable for an MSc-level student; prior RL and/or LLM experience is essential.
