Page 130 - ITU Journal, Future and evolving technologies - Volume 1 (2020), Issue 1, Inaugural issue
[Figure 5: block diagram. Inputs to the SMO: GSMA P-NEST/S-NEST and the RAN NSI ServiceProfile / NSSI SliceProfile (dlThptPerSlice, dlThptPerUe, termDensity). ML Training Host: Trainer of RAN Cross-Slice Management Policies, with one DQN agent per slice (1 to K), each with an evaluation NN, a target NN and an experience dataset, trained against an environment fed with synthetic data or real network data. ML Inference Host & Actor: RAN Cross-Slice Manager, with one policy (evaluation NN) per slice (1 to K), required capacity per tenant/cell, and resource usage quota computation. RAN nodes: O-CU, O-DU and O-RU for cells 1 to N. O1 interface: performance measurements towards the manager; RRMPolicyRatio with rRMPolicyDedicatedRatio (k,n) per rRMPolicyMemberList (S-NSSAI) towards the RAN nodes. Legend: management interfaces; interactions between components; internal interactions inside a component.]
Fig. 5 – Deep Q Network-based cross-slice optimization solution
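The per-slice structure shown in Fig. 5 (one action selection policy per slice, feeding a common resource usage quota computation) can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the class `CrossSliceManager`, the slice identifiers, the convention that a policy returns one requested capacity share per cell, and in particular the proportional scaling rule, which is only a placeholder and not the computation defined in [33].

```python
from typing import Callable, Dict, List

# Hypothetical types: a state is a list of floats, and a policy maps a
# slice's state to one requested capacity share per cell (each in [0, 1]).
State = List[float]
Policy = Callable[[State], List[float]]


class CrossSliceManager:
    """Holds one action selection policy per slice (illustrative only)."""

    def __init__(self, num_cells: int) -> None:
        self.num_cells = num_cells
        self.policies: Dict[str, Policy] = {}

    def add_slice(self, slice_id: str, policy: Policy) -> None:
        # Adding a slice only requires registering its own policy.
        self.policies[slice_id] = policy

    def remove_slice(self, slice_id: str) -> None:
        # Removing a slice leaves the other slices' policies untouched.
        self.policies.pop(slice_id, None)

    def compute_quotas(self, states: Dict[str, State]) -> Dict[str, List[float]]:
        # Run each slice's policy independently on its own state.
        requested = {k: self.policies[k](states[k]) for k in self.policies}
        quotas = {k: [0.0] * self.num_cells for k in requested}
        for n in range(self.num_cells):
            total = sum(requested[k][n] for k in requested)
            # Assumed placeholder rule: scale down proportionally when the
            # requests for a cell add up to more than its full capacity.
            scale = 1.0 / total if total > 1.0 else 1.0
            for k in requested:
                quotas[k][n] = requested[k][n] * scale
        return quotas


if __name__ == "__main__":
    manager = CrossSliceManager(num_cells=2)
    manager.add_slice("slice-1", lambda s: [0.6, 0.3])
    manager.add_slice("slice-2", lambda s: [0.6, 0.3])
    print(manager.compute_quotas({"slice-1": [0.0], "slice-2": [0.0]}))
```

The point of the sketch is the structure rather than the numbers: each slice contributes exactly one entry to the policy table, so adding or removing a slice does not require touching the entries of the others.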
proposed ML-based solution relies on Multi-Agent Reinforcement Learning (MARL) based on Deep Q-Network (DQN), whose mathematical details can be found in [33]. An important advantage of this multi-agent scheme is that it uses slice-specific DQN agents and action selection policies for the training and inference processes; therefore, slices can easily be added to or removed from the scenario just by adding or removing the corresponding agent and action selection policy. Specifically, as seen in Fig. 5, the solution includes a Resource Usage Quota Computation module that determines the resource usage quota values (k,n), for slice k in cell n, based on the outputs obtained through the execution of K action selection policies, one per slice k. Each of these policies is specified through a deep neural network (NN) defined by a vector of parameters, specific to slice k, that have been previously learnt during the training process, as illustrated in Fig. 5.

The solution considered here assumes that the SLA specification of the RAN slice requirements is based on three ServiceProfile parameters explained in Section 3, namely dlThptPerSlice, dlThptPerUe and termDensity, which are directly derived from the GSMA GST template and used as inputs for the different solution components described in more detail in the following subsections. The specific values of these parameters for slice k are denoted as dlThptPerSlice(k), dlThptPerUe(k) and termDensity(k) (see footnote 1).

4.1 RAN cross-slice manager

This component includes the inference part of the DQN model and the functions needed to configure the RAN nodes through the O1 interface (i.e. it takes the ML inference and actor roles of the O-RAN
1 For the interested reader, the parameters of the algorithm described in [33] are related to the considered ServiceProfile attributes as follows: Scenario Aggregated Guaranteed Bit Rate (SAGBR) = dlThptPerSlice; Maximum Cell Bit Rate (MCBR) of slice k in cell n: MCBR(k,n) = dlThptPerUe(k) × termDensity(k) × (service area of cell n).
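As a worked illustration of the MCBR relation in footnote 1, read here as a product of the three quantities, the following Python sketch computes MCBR(k,n) for one slice and one cell. The function name `mcbr`, the units (Mbit/s per UE, terminals per km², km²) and the numerical values are assumptions chosen for illustration only; they do not come from the paper.

```python
def mcbr(dl_thpt_per_ue: float, term_density: float, cell_area: float) -> float:
    """Return dlThptPerUe(k) x termDensity(k) x service area of cell n.

    Assumed units: Mbit/s per UE, terminals per km^2, km^2.
    The result is then in Mbit/s.
    """
    return dl_thpt_per_ue * term_density * cell_area


# Illustrative values only (not taken from the paper):
# 5 Mbit/s per UE, 100 terminals per km^2, cell service area of 0.5 km^2.
example = mcbr(dl_thpt_per_ue=5.0, term_density=100.0, cell_area=0.5)
print(example)  # 250.0
```

With these assumed values, the slice would be limited to 250 Mbit/s in that cell, since 100 terminals/km² over 0.5 km² gives 50 terminals at 5 Mbit/s each.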
110 © International Telecommunication Union, 2020