Approaching realism in social dilemmas: moody reinforcement learners can conditionally replicate human models

Feehan, Grace

doi:10.26174/thesis.lboro.24556471.v1

Approaching Realism In Social Dilemmas Doctoral Thesis - Grace Feehan.pdf (3.73 MB)

thesis

posted on 2023-11-16, 15:40 authored by Grace Feehan

More obscure components of affect - such as mood - are frequently undervalued in artificial agent modelling, despite evidence that they may influence human social processes. Artificial mood in particular has been shown to increase the proportion of prosocial behaviours when combined with typically self-focused machine learning algorithms, though research at this stage is fairly preliminary and has much scope for expansion. Social dilemmas provide a simplified and succinct domain in which to test such mechanisms whilst adding novel experimental elements from human research. Utilising the Iterated Prisoner’s Dilemma, we evaluate a promising existing moody reinforcement learning model across a broad range of network structures and environmental manipulations. With the end goal of identifying flaws in the model and suggesting improvements, we first review what it means to model such human structures, what psychological research tells us about such structures to begin with, and alternative existing models to the one deployed here. Through these, we design a formal framework of critique and analysis to thoroughly review the existing state of the algorithm. Then, we present three clusters of quantitative experiments across two custom-designed multi-agent network simulations - static and dynamic - with a number of additional experimental factors taken from prior agent work and existing human psychological research. These factors include manipulation of interaction structure, the payoff matrix used, the proportions of game-playing strategies present, the ability to reject game partners, the method by which we evaluate game partners for this rejection, and variables controlling the restructuring of the network itself. Overall, we find that all but one of these factors provide methods by which we can further enhance the algorithm’s naturally cooperative nature.
In particular, the most elucidating aspects with regard to modelling human behavioural trends are the algorithm’s natural reactivity to a summary variable of the payoff matrix structure - the Cooperation Index - and to certain dynamic network restructuring factors, both of which are completely novel to the literature on this algorithm. The latter of these two is only facilitated through one particular partner-rejection evaluation strategy, despite its simple structure, and motivates further research. Lastly, we provide in-depth analysis of the algorithm’s strengths and flaws, with clear outlines for judging its successes and recommendations for its next stages of development. Combining these, we hope to add to the growth of this particular computational affect model in new ways.
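
For readers unfamiliar with the Cooperation Index mentioned in the abstract, the following is a minimal sketch of how such a summary variable is commonly computed from a Prisoner's Dilemma payoff matrix. It assumes the standard definition K = (R - P) / (T - S), where T, R, P and S are the temptation, reward, punishment and sucker payoffs. The class and function names are hypothetical and are not taken from the thesis, which may define or use the index differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    """Row player's payoffs in a two-player Prisoner's Dilemma (illustrative names)."""
    T: float  # temptation: defect while the partner cooperates
    R: float  # reward: mutual cooperation
    P: float  # punishment: mutual defection
    S: float  # sucker: cooperate while the partner defects

    def is_prisoners_dilemma(self) -> bool:
        # Standard ordering T > R > P > S, plus 2R > T + S so that mutual
        # cooperation beats taking turns exploiting each other.
        return self.T > self.R > self.P > self.S and 2 * self.R > self.T + self.S


def cooperation_index(p: Payoffs) -> float:
    """Return K = (R - P) / (T - S), assuming the standard definition."""
    if not p.is_prisoners_dilemma():
        raise ValueError("payoffs do not satisfy the Prisoner's Dilemma constraints")
    return (p.R - p.P) / (p.T - p.S)


if __name__ == "__main__":
    # A commonly used payoff matrix: T=5, R=3, P=1, S=0.
    classic = Payoffs(T=5, R=3, P=1, S=0)
    print(cooperation_index(classic))
```

Under this definition K lies strictly between 0 and 1 for any valid Prisoner's Dilemma, and larger values correspond to payoff matrices that are, on paper, more favourable to cooperation.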

History

School

Science

Department

Computer Science

Publisher

Loughborough University

Publication date

2023

Notes

A doctoral thesis submitted in partial fulfilment of the requirements for the award of doctor of philosophy of Loughborough University

Language

en

Supervisor(s)

Shaheen Fatima

Qualification name

PhD

Qualification level

Doctoral

Loughborough Email address

g.feehan@lboro.ac.uk

This submission includes a signed certificate in addition to the thesis file(s)

I have submitted a signed certificate

Student ID number

B833348

Administrator link

https://repository.lboro.ac.uk/account/articles/24556486

Keywords

reinforcement learning, multiagent networks, mood, emotion, psychology, SARSA, dynamic networks, static networks, affect, iterated prisoners dilemma, cooperation index

Licence

CC BY-NC-ND 4.0
