Edge Swarms: Learning Cooperative Behaviour in Distributed AI Systems

Background

Centralised placement controllers are simple to reason about but brittle in practice: they introduce a single point of failure, scale poorly, and incur growing communication overhead as the network expands. Decentralised approaches, in which each edge node makes its own decisions, are more resilient, but risk sub-optimal global outcomes when nodes act purely in self-interest.

Multi-Agent Reinforcement Learning (MARL) offers a principled framework for training a population of edge agents to collaborate. Each agent observes its local state (load, queue depth, neighbour connectivity), takes placement actions, and receives rewards that reflect both local performance and system-wide efficiency. Over time, agents can learn policies that serve their own node well while also improving system-wide outcomes, without requiring a central coordinator.
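
One way to picture a reward that reflects both local performance and system-wide efficiency is a weighted blend of the two. The sketch below is illustrative only and is not the project's reward design: the names LocalState, local_reward, global_reward, blended_rewards and the global_weight parameter are all assumptions, as is the choice of load variance as a stand-in for system-wide efficiency.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalState:
    """Hypothetical per-node observation (field names are assumptions)."""
    load: float       # fraction of capacity in use, assumed to lie in [0, 1]
    queue_depth: int  # number of requests waiting at this node


def local_reward(state: LocalState, max_queue: int = 100) -> float:
    """Penalise high load and long queues; the result lies in [-1, 0]."""
    queue_penalty = min(state.queue_depth, max_queue) / max_queue
    return -(state.load + queue_penalty) / 2.0


def global_reward(states: List[LocalState]) -> float:
    """Assumed system-wide signal: negative variance of load across nodes."""
    loads = [s.load for s in states]
    mean = sum(loads) / len(loads)
    variance = sum((load - mean) ** 2 for load in loads) / len(loads)
    return -variance


def blended_rewards(states: List[LocalState], global_weight: float = 0.5) -> List[float]:
    """Mix each node's local reward with the shared global reward.

    global_weight = 0 gives purely self-interested agents;
    global_weight = 1 gives every agent the same team reward.
    """
    shared = global_reward(states)
    return [
        (1.0 - global_weight) * local_reward(s) + global_weight * shared
        for s in states
    ]
```

Varying global_weight between 0 and 1 moves the agents from purely self-interested to fully team-rewarded, which is the trade-off that reward design has to settle.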

Research Challenge

As AI systems evolve towards networks of autonomous agents, centralised control architectures are becoming a bottleneck. Future edge computing environments may contain hundreds or thousands of distributed nodes supporting AI-powered services for transportation, healthcare, public safety, environmental monitoring and digital citizen services. Coordinating these resources through a single controller introduces scalability limitations, communication overheads and potential single points of failure.

This project investigates how Multi-Agent Reinforcement Learning (MARL) can be used to enable decentralised service orchestration across large-scale edge infrastructures. In this approach, each edge node acts as an autonomous agent that makes local decisions regarding service placement, migration and resource allocation while simultaneously learning to cooperate with neighbouring agents to optimise overall system performance.
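
As a rough illustration of a node acting as an autonomous learning agent, the sketch below uses independent tabular Q-learning, one of the simplest MARL baselines, in which each agent learns from its own experience and treats other agents as part of the environment. It is not presented as the project's method. The EdgeAgent class, the discretised load states ("high", "low") and the action names ("keep", "migrate") are hypothetical.

```python
import random
from collections import defaultdict


class EdgeAgent:
    """Independent tabular Q-learner for a single edge node (illustrative)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def act(self, state):
        """Epsilon-greedy choice over this node's local actions."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Usage: an overloaded node is rewarded for migrating a service away.
agent = EdgeAgent(["keep", "migrate"], epsilon=0.0, seed=0)
agent.learn(state="high", action="migrate", reward=1.0, next_state="low")
choice = agent.act("high")  # greedy choice now prefers "migrate"
```

Independent learners like this need no coordinator, but each agent's environment shifts as its neighbours learn, which is one reason cooperation and information sharing become research questions in their own right.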

A key challenge is balancing local autonomy with global objectives. Agents must learn when to cooperate, what information to share, and how to coordinate decisions in environments characterised by incomplete information, dynamic workloads, communication delays and infrastructure failures. The research will address reward design, cooperative decision-making, scalability, resilience, agent communication and adaptation to changing network conditions.

Example application domains include smart-city infrastructure management, autonomous transportation systems, distributed AI agent platforms, collaborative robotics, edge-hosted digital assistants and large-scale Internet of Things (IoT) environments. The work will involve developing and evaluating MARL-based orchestration strategies through simulation and experimentation, comparing their performance against traditional centralised and heuristic-based approaches.

Topics

Multi-Agent Reinforcement Learning (MARL); Autonomous AI Agents; Cooperative Decision-Making; Decentralised Service Orchestration; Distributed Artificial Intelligence; Edge Computing and Edge AI; Resource Allocation and Optimisation; Agent Communication Protocols; Resilient and Adaptive Systems

Impact

Many researchers believe that the future of AI lies not in individual models, but in large populations of specialised agents that collaborate to solve complex tasks. From autonomous transportation systems and smart energy networks to distributed robotics and next-generation digital assistants, these systems will require new approaches to coordination that can operate without constant central oversight.

This project addresses one of the fundamental challenges of agentic AI: how autonomous agents can learn to cooperate effectively while pursuing shared objectives. The outcomes could contribute to the development of more scalable, resilient and intelligent distributed systems capable of supporting the next generation of AI-enabled infrastructure. More broadly, the project provides insight into how large populations of AI agents can collectively make decisions in complex, dynamic and resource-constrained environments.