Loughborough University

Learning data-driven decision-making policies in multi-agent environments for autonomous systems

journal contribution
posted on 2021-11-05, 13:58 authored by Joosep Hook, Seif El-Sedky, Varuna De-Silva, Ahmet Kondoz
Autonomous systems such as Connected Autonomous Vehicles (CAVs) and assistive robots are set to improve the way we live. Autonomous systems need to be equipped with capabilities to [...] Reinforcement Learning (RL) is a type of machine learning in which an agent learns by interacting with its environment through trial and error, and it has gained significant interest from the research community for its promise to learn decision making efficiently through abstraction of experiences. However, most current autonomous systems, such as driverless vehicle prototypes or mobile robots, are controlled through supervised learning methods or manually designed rule-based policies. Additionally, many emerging autonomous systems, such as driverless cars, are set in a multi-agent environment, often with partial observability. Learning decision-making policies in multi-agent environments is a challenging problem because the environment is not stationary from the perspective of a learning agent, and hence the Markov property assumed in single-agent RL does not hold. This paper focuses on learning decision-making policies in multi-agent environments, both in cooperative settings with full observability and in dynamic environments with partial observability. We present experiments in simple, yet effective, new multi-agent environments to simulate policy learning in scenarios that could be encountered by an autonomous navigating agent such as a CAV. The results illustrate how agents learn to cooperate in order to achieve their objectives successfully. The results also show that, in a partially observable setting, an agent was capable of learning to roam around its environment without colliding in the presence of obstacles and other moving agents. Finally, the paper discusses how data-driven multi-agent policy learning can be extended to real-world environments by augmenting the intelligence of autonomous vehicles.

Funding

MIMIc: Multimodal Imitation Learning in MultI-Agent Environments

Engineering and Physical Sciences Research Council


History

School

  • Loughborough University London

Published in

Cognitive Systems Research

Volume

65

Pages

40 - 49

Publisher

Elsevier

Version

  • AM (Accepted Manuscript)

Rights holder

© Elsevier

Publisher statement

This paper was accepted for publication in the journal Cognitive Systems Research and the definitive published version is available at https://doi.org/10.1016/j.cogsys.2020.09.006

Acceptance date

2020-09-20

Publication date

2020-09-28

Copyright date

2021

ISSN

1389-0417

Language

  • en

Depositor

Dr Varuna De Silva. Deposit date: 4 November 2021
