Accepted paper.pdf (247.28 kB)

Risk-aware multi-armed bandits with refined upper confidence bounds

Download (247.28 kB)
journal contribution
posted on 07.01.2021, 11:52 by Xingchi Liu, Mahsa Derakhshani, Sangarapillai Lambotharan, Mihaela Van der Schaar
The classical multi-armed bandit (MAB) framework studies the exploration-exploitation dilemma of the decisionmaking problem and always treats the arm with the highest expected reward as the optimal choice. However, in some applications, an arm with a high expected reward can be risky to play if the variance is high. Hence, the variation of the reward should be considered to make the arm-selection process risk-aware. In this paper, the mean-variance metric is investigated to measure the uncertainty of the received rewards. We first study a risk-aware MAB problem when the reward follows a Gaussian distribution, and a concentration inequality on the variance is developed to design a Gaussian risk aware-upper confidence bound algorithm. Furthermore, we extend this algorithm to a novel asymptotic risk aware-upper confidence bound algorithm by developing an upper confidence bound of the variance based on the asymptotic distribution of the sample variance. Theoretical analysis proves that both proposed algorithms achieve the O(log(T)) regret. Finally, numerical results demonstrate that our algorithms outperform several risk-aware MAB algorithms.

Funding

Royal Academy of Engineering under the Leverhulme Trust Research Fellowship scheme (DerakhshaniLTRF1920\16\67)

History

School

  • Mechanical, Electrical and Manufacturing Engineering

Published in

IEEE Signal Processing Letters

Volume

28

Pages

269 - 273

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Version

AM (Accepted Manuscript)

Rights holder

© IEEE

Publisher statement

Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Acceptance date

12/12/2020

Publication date

2020-12-28

Copyright date

2021

ISSN

1070-9908

eISSN

1558-2361

Language

en

Depositor

Dr Mahsa Derakhshani. Deposit date: 4 January 2021

Exports