Page 130 - ITU Journal, Future and evolving technologies - Volume 1 (2020), Issue 1, Inaugural issue






[Figure 5: block diagram. At the SMO, the GSMA P-NEST/S-NEST templates provide the RAN NSI ServiceProfile / NSSI SliceProfile parameters (dlThptPerSlice, dlThptPerUe, termDensity). The ML Training Host (Trainer of RAN Cross-Slice Management Policies) contains one DQN agent per slice, 1 to K, each with an evaluation NN, a target NN and an experience dataset, and interacts with a training environment built from synthetic data or real network data and the required capacity per tenant/cell. The ML Inference Host & Actor (RAN Cross-Slice Manager) contains one policy per slice, 1 to K, and a Resource Usage Quota Computation module. It obtains performance measurements over O1 and configures RRMPolicyRatio rRMPolicyDedicatedRatio (k,n) per rRMPolicyMemberList (S-NSSAI) over O1 in the O-CU, O-DU and O-RU nodes of cells 1 to N. Legend: management interfaces; interactions between components; internal interactions inside a component.]

Fig. 5 – Deep Q Network-based cross-slice optimization solution
proposed ML-based solution relies on Multi-Agent Reinforcement Learning (MARL) based on Deep Q-Network (DQN), whose mathematical details can be found in [33]. An important advantage of this multi-agent scheme is that it uses slice-specific DQN agents and action selection policies for the training and inference processes. Slices can therefore be added to or removed from the scenario simply by adding or removing the corresponding agent and action selection policy. Specifically, as seen in Fig. 5, the solution includes a Resource Usage Quota Computation module that determines the resource usage quota values (k,n) based on the outputs obtained by executing K action selection policies (k), each one associated with one slice. Each of these policies is specified through a deep neural network (NN) defined by a vector of parameters k that have been previously learnt during the training process, as illustrated in Fig. 5.

The solution considered here assumes that the SLA specification of the RAN slice requirements is based on three ServiceProfile parameters explained in Section 3, namely dlThptPerSlice, dlThptPerUe and termDensity. These are directly derived from the GSMA GST template and used as inputs for the different solution components described in more detail in the following subsections. The specific values of these parameters for slice k are denoted as dlThptPerSlice(k), dlThptPerUe(k) and termDensity(k).¹

4.1  RAN cross-slice manager

This component includes the inference part of the DQN model and the functions needed to configure the RAN nodes through the O1 interface (i.e. it takes the ML inference and actor roles of the O-RAN


1  For the interested reader, the parameters of the algorithm described in [33] are related to the considered Service Profile attributes as follows: Scenario Aggregated Guaranteed Bit Rate (SAGBR) = dlThptPerSlice; Maximum Cell Bit Rate (MCBR) of slice k in cell n: MCBR(k,n) = dlThptPerUe(k) × termDensity(k) × cell n service area.
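
The mapping in footnote 1 can be illustrated with a minimal sketch. It assumes that the two gaps in the MCBR expression are multiplications, which is what makes the result a bit rate (per-UE throughput times terminals per unit area times area). The class and function names, the units (Mbit/s, terminals per km², km²) and all numeric values are hypothetical and are not taken from the paper or from [33].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceProfile:
    """Per-slice SLA parameters named in the text (units are assumptions)."""
    dl_thpt_per_slice: float  # dlThptPerSlice(k), assumed Mbit/s
    dl_thpt_per_ue: float     # dlThptPerUe(k), assumed Mbit/s
    term_density: float       # termDensity(k), assumed terminals per km^2


def sagbr(profile: ServiceProfile) -> float:
    """SAGBR of a slice: equal to dlThptPerSlice, as stated in footnote 1."""
    return profile.dl_thpt_per_slice


def mcbr(profile: ServiceProfile, cell_area_km2: float) -> float:
    """MCBR(k,n), assuming the product form
    dlThptPerUe(k) * termDensity(k) * service area of cell n."""
    return profile.dl_thpt_per_ue * profile.term_density * cell_area_km2


# Hypothetical scenario: two slices (k = 1, 2) and two cells (n = 1, 2).
profiles = {
    1: ServiceProfile(dl_thpt_per_slice=200.0, dl_thpt_per_ue=5.0, term_density=100.0),
    2: ServiceProfile(dl_thpt_per_slice=50.0, dl_thpt_per_ue=1.0, term_density=400.0),
}
cell_areas_km2 = {1: 0.25, 2: 0.5}

# One MCBR value per (slice k, cell n) pair.
mcbr_table = {
    (k, n): mcbr(profile, area)
    for k, profile in profiles.items()
    for n, area in cell_areas_km2.items()
}

for (k, n), value in sorted(mcbr_table.items()):
    print(f"MCBR({k},{n}) = {value} Mbit/s")
```

Keeping the profiles in a dictionary keyed by slice mirrors the slice-specific design described in the text: adding or removing an entry changes the set of slices without touching the others.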




© International Telecommunication Union, 2020