<p dir="ltr">Effectively transferring knowledge from one task to another in cooperative Multi-Agent Reinforcement Learning (MARL) can be key to accelerating learning in complex scenarios. For instance, in sports it is common to train in simpler situations and then apply what was learned in the real game. The same logic applies to other domains with increasing levels of difficulty, such as robotics tasks or healthcare applications. In MARL, transferring knowledge can be extremely challenging due to factors such as changes in the observation and action spaces, or in the underlying dynamics of the environment. Consequently, most current methods still train each new task from scratch. However, as tasks become increasingly complex, there is growing interest in leveraging the behavioural similarities that emerge among semantically similar tasks. In this context, we explore techniques that facilitate the rapid transfer of knowledge from one policy network to another within off-policy value-based methods. Additionally, we introduce a special case that enables function-preserving transfers of centralised functions between tasks. Our work offers a promising strategy for reducing training time and enables zero-shot task transfer.</p>
Funding
ATRACT: A Trustworthy Robotic Autonomous system to support Casualty Triage
Engineering and Physical Sciences Research Council
This accepted manuscript has been made available under the Creative Commons Attribution licence (CC BY) under the IEEE JISC UK green open access agreement.