This paper addresses autonomous device-to-device (D2D) communication in cellular networks. Each D2D pair aims to maximize its throughput subject to minimum signal-to-interference-plus-noise ratio (SINR) constraints. The problem is modeled as a stochastic non-cooperative game in which the players (D2D pairs) have no prior information on the availability and quality of the selected channels. Each player therefore becomes a 'learner' that explores all of its possible strategies based on its locally observed throughput and state (defined by the channel quality). We propose a multi-agent Q-learning algorithm based on the players' 'beliefs' about the strategies of their counterparts and show its implementation in a Long Term Evolution-Advanced (LTE-A) network. Simulations show that the algorithm achieves near-optimal performance after a small number of iterations.
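The learning loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses plain independent epsilon-greedy Q-learning and omits the belief-based update, whose details the abstract does not give. The link gains, the interference model, the two-valued state (whether the SINR constraint was met in the previous step) and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def sinr(pair, choices, gains, cross_gain, power, noise):
    """SINR of `pair` on its chosen channel, given every pair's choice."""
    channel = choices[pair]
    # Assumed model: every other pair on the same channel adds equal interference.
    interference = sum(
        cross_gain * power
        for other, ch in enumerate(choices)
        if other != pair and ch == channel
    )
    return gains[pair][channel] * power / (noise + interference)


def reward(sinr_value, sinr_min):
    """Shannon throughput (bit/s/Hz) if the SINR constraint holds, else 0."""
    return math.log2(1.0 + sinr_value) if sinr_value >= sinr_min else 0.0


def train(n_pairs=2, n_channels=2, steps=2000, alpha=0.1, gamma=0.9,
          epsilon=0.1, sinr_min=2.0, seed=0):
    """Independent Q-learning; each pair sees only its own SINR and reward."""
    rng = random.Random(seed)
    # Hypothetical static link gains: gains[pair][channel].
    gains = [[rng.uniform(0.5, 1.0) for _ in range(n_channels)]
             for _ in range(n_pairs)]
    cross_gain, power, noise = 0.5, 1.0, 0.1
    # q[pair][state][channel]; state is 1 if the constraint was met last step.
    q = [[[0.0] * n_channels for _ in range(2)] for _ in range(n_pairs)]
    states = [0] * n_pairs
    for _ in range(steps):
        # Every pair picks a channel: explore with probability epsilon.
        choices = []
        for i in range(n_pairs):
            if rng.random() < epsilon:
                choices.append(rng.randrange(n_channels))
            else:
                row = q[i][states[i]]
                choices.append(row.index(max(row)))
        # Every pair updates its own table from its local observation.
        for i in range(n_pairs):
            s = sinr(i, choices, gains, cross_gain, power, noise)
            r = reward(s, sinr_min)
            nxt = 1 if s >= sinr_min else 0
            a = choices[i]
            td_target = r + gamma * max(q[i][nxt])
            q[i][states[i]][a] += alpha * (td_target - q[i][states[i]][a])
            states[i] = nxt
    return q
```

Calling `train()` returns one Q-table per pair. With the assumed gains, two pairs sharing a channel both fall below `sinr_min` and earn zero reward, so the learned tables tend to favor distinct channels.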
School: Science
Department: Computer Science
Published in: 2016 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS)