Page 203 - Kaleidoscope Academic Conference Proceedings 2021

REINFORCEMENT LEARNING FOR SCHEDULING AND MIMO BEAM SELECTION USING CAVIAR SIMULATIONS

João Paulo Tavares Borges¹; Ailton Pinto de Oliveira¹; Felipe Henrique Bastos e Bastos¹; Daniel Takashi Né do Nascimento Suzuki¹; Emerson Santos de Oliveira Junior¹; Lucas Matni Bezerra²; Cleverson Veloso Nahum¹; Pedro dos Santos Batista³; Aldebaro Barreto da Rocha Klautau Júnior¹

¹ Universidade Federal do Pará, Belém 66075-110, Brazil
² Universidade Estácio de Sá, Belém 66055-260, Brazil
³ Ericsson Research, 164 80 Stockholm, Sweden

ABSTRACT

This paper describes a framework for research on Reinforcement Learning (RL) applied to scheduling and MIMO beam selection. In this framework, the RL agent is asked to schedule a user and then choose the index of a beamforming codebook to serve it. A key aspect of this problem is that the simulation of the communication system and the artificial intelligence engine is based on a virtual world created with AirSim and the Unreal Engine. These components enable the so-called CAVIAR methodology, which leads to highly realistic 3D scenarios. This paper describes the communication and RL modeling adopted in the framework and also presents statistics concerning the implemented RL environment, such as data traffic, as well as results for three baseline systems.

Keywords - 5G, 6G, beam selection, MIMO, mmWave, RL

1. INTRODUCTION

Reinforcement Learning (RL) is a learning paradigm suitable for problems in which an agent has to maximize a given reward while interacting with an ever-changing environment. This class of problems appears in several areas of interest in 5th Generation (5G) and 6th Generation (6G) mobile networks, such as congestion control [1], network slicing [2], resource allocation [3], and the 5G Physical Layer (PHY) [4]. However, the lack of freely available data sets or environments to train and assess RL agents is a practical obstacle that delays the widespread adoption of RL in 5G and future networks.

To address this challenge, some works explore the use of virtual worlds to generate data sets by creating environments for communications in general [5], and for Artificial Intelligence (AI) / Machine Learning (ML) applied to 5G/6G [6], leveraging the fact that 5G and beyond systems will benefit from rich contextual information to improve performance and reduce the loss of radio resources to support their services [4, 7, 8]. So, the key idea in this paper is to use realistic representations of deployment sites together with physics and sensor simulations to generate a virtual representation that, combined with the communication network simulator, enables training RL agents for tasks such as beam selection.

Figure 1 – CAVIAR simulation scenario, depicting the radiation pattern (in light green) corresponding to the chosen beamforming codebook index to serve a drone (at the right).

Systems such as IEEE 802.11ad are usually designed for worst-case scenarios and, in most situations, continuously send signals that do not carry information (overhead) [9]. This overhead may represent a significant portion of the channel capacity, and decreasing it is a fundamental problem whose solution can enable systems to improve the usage of physical resources (e.g., with lower latency and higher bit rates) [10, 11, 12].

In this work, the beam selection and user scheduling problems are posed as a game that must be solved with RL. The game is based on a simulation methodology named Communication Networks, Artificial Intelligence and Computer Vision with 3D Computer-Generated Imagery (CAVIAR), with a preliminary version proposed in [13]. The CAVIAR simulation integrates three subsystems: the communication system, the AI and ML models, and finally the virtual world components. In this paper, the problem is based on simulating a communication system immersed in a virtual world created with AirSim [14] and Unreal Engine [15].

More specifically, the goal is to schedule and allocate resources to Unmanned Aerial Vehicles (UAVs), cars and pedestrians, composing a scenario with aerial and terrestrial




           978-92-61-33881-7/CFP2168P @ ITU 2021          – 141 –                                   Kaleidoscope