Page 121 - Proceedings of the 2018 ITU Kaleidoscope

Machine learning for a 5G future




$(l, p, b, t, u_1, u_2, \ldots, u_N, q_1, q_2, \ldots, q_N)$. That is, the state
is defined by the length of the generated packet, the priority of such a packet,
the remaining battery of the node, the time of the day ($t$), the usage level of
each RAT, and the occupation of their corresponding queues. When the above
vector is fed into the ANN-based policy, an action that maximizes the expected
total reward is obtained. In turn, we have mathematically modeled the reward
obtained for each action as follows:

$$r(s, a_i) = \frac{p \cdot l}{d(a_i)}$$

The reward is modeled as the priority multiplied by the length of the
transmitted packet (in bits), divided by the delay in the transmission. This
way, nodes are encouraged to report events as fast as possible, while
prioritizing their importance. The units of this reward are bits per second,
which match the units of a throughput metric. Hence, what the reward function
maximizes is the prioritized throughput of an IoT node. Note that the delay of
the action $a_i$ is not only related to the bitrate of the $i$-th RAT but also
to the occupation of the queue of such a RAT. Also, since battery depletion
prevents nodes from reporting more events (and hence, 0 reward is obtained from
that point onwards), $\pi^*$ naturally optimizes energy consumption as well.

6.  SIMULATIONS AND RESULTS

To evaluate the mathematical framework and its solution via RL, we have
simulated an IoT network in which nodes support two RATs (that is, $N = 2$): a
cellular-based connection (this might represent future 5G cellular links) and a
LoRa transceiver. Thus, $a_1$ and $a_2$ indicate the actions of using 5G and
LoRa respectively. It is very common, and will be even more popular in 5G
deployments, that parts of the licensed spectrum are sub-rented or even offered
in an auction-based fashion to third parties (this is the cornerstone of, e.g.,
cognitive spectrum applications in 4G/5G networks [24], [25]). These third
parties may be, for example, operators of IoT networks or governments that need
a wireless infrastructure for their Smart City initiatives. We have modeled this
spectrum renting scheme and assumed that a sub-band of a wider licensed band is
at the disposal of the IoT nodes, offering a transmission speed of 100 kbps. For
the LoRa connection, we assume that devices transmit with SF=7 and thus have a
throughput of 2.43 kbps. Regarding band usage limitations, we limit the use of
5G networks to 1 Mb per day ($u_1^{\max} = 1$ Mb/day) to reduce operational
costs. We do not set any daily restrictions on LoRa ($u_2^{\max} = \infty$). To
illustrate the impact of the off-period policies (imposed in most European
countries) on the performance of IoT networks, we simulate two environments, one
with such a policy enforced and one without it. Energy consumption of both RATs
is taken from [26] and [27] for the cellular and LoRa-based RAT respectively
(determining it is outside the scope of this work). We assume that IoT motes are
powered by two AA batteries designed to last at least three years (i.e., the
maximum allowed energy consumption for a day is 1/1095 of the total energy
stored in such batteries, where 1095 is the number of days in three years). The
average event-generation rate is varied between 1 packet every 30 seconds
($\lambda = 1/30$) and 1 packet every 10 minutes ($\lambda = 1/600$) to assess
the influence of this figure on the total attained reward; these values are in
line with typical wireless monitoring networks [28]–[30]. Furthermore, packets
of varying sizes are assumed to be generated; specifically, lengths are randomly
generated with uniform distribution between 30 and 200 bytes (again in
accordance with typical IoT deployments [11]). Similarly, priorities of packets
are also uniformly distributed between 0 and 1. Finally, Table 1 specifies all
considered parameters.

The Evolution Strategies (ES) algorithm has been allowed to run for 1000
iterations to iteratively tune the ANN-defined policy. This ANN is, in turn,
composed of two hidden layers of 45 and 5 neurons with tanh as the activation
function [31]. Figure 1 shows the training phase of the ANN-based policy. In
this process the expected total reward obtained for a whole day (that is,
$\gamma$) increases as the policy $\pi$ improves by iteratively applying the ES
algorithm. This illustrates that the obtained policy is progressively refined so
that IoT nodes act more wisely. Note that transmission policies are trained
off-line in more powerful computers while the practical use
Table 1 – Parameters of the simulation

Parameter                                  Value
5G rate                                    100 kbps
LoRa rate                                  2.43 kbps
LoRa off-period (if applicable)            $T_{\mathrm{air}}/\delta - T_{\mathrm{air}}$ with $\delta = 0.01$
5G usage limitation ($u_1^{\max}$)         1 Mb/day
LoRa usage limitation ($u_2^{\max}$)       ∞
5G power consumption ($P_1$)               2.15 W
LoRa power consumption ($P_2$)             0.1353 W
Battery of nodes ($b$)                     2 AA batteries (30780 joules in total, 28.1 joules/day)
Average events per second ($\lambda$)      Varied from 1/30 to 1/600
Packet length ($l$)                        $\mathcal{U}(30, 200)$ bytes
Packet priority ($p$)                      $\mathcal{U}(0, 1)$

Figure 1 – Evolution of $\gamma$ as the ES algorithm iterates (first 250
iterations are shown). Obtained for $\lambda = 1/90$ and off-period not
enforced. Note that γ ultimately indicates the average number of prioritized
bits transmitted in a day.
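
The energy figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the rates, power values and battery capacity come from Table 1, whereas the per-packet energy model (power multiplied by airtime, with no wake-up or protocol overhead), the reading of 1 Mb as 10^6 bits, and all function names are assumptions made for this example.

```python
# Values from Table 1; everything else is an illustrative assumption.
BATTERY_J = 30780.0        # total energy of two AA batteries, in joules
LIFETIME_DAYS = 3 * 365    # three-year target lifetime (1095 days)

RATS = {
    "5G": {"rate_bps": 100_000.0, "power_w": 2.15},
    "LoRa": {"rate_bps": 2_430.0, "power_w": 0.1353},
}


def daily_energy_budget(battery_j=BATTERY_J, days=LIFETIME_DAYS):
    """Energy a node may spend per day so that the battery lasts `days`."""
    return battery_j / days


def tx_time_s(length_bytes, rate_bps):
    """Airtime of one packet, ignoring headers and queueing (assumption)."""
    return 8 * length_bytes / rate_bps


def tx_energy_j(length_bytes, rat):
    """Assumed energy model: transmit power multiplied by airtime."""
    return rat["power_w"] * tx_time_s(length_bytes, rat["rate_bps"])


def max_packets(cap_bits, length_bytes):
    """Packets of a given size that fit in a daily usage cap (in bits)."""
    return cap_bits // (8 * length_bytes)


if __name__ == "__main__":
    print(f"daily budget: {daily_energy_budget():.2f} J")
    for name, rat in RATS.items():
        print(f"{name}: {tx_energy_j(200, rat):.4f} J per 200-byte packet")
    print("200-byte packets within a 1 Mb cap:", max_packets(1_000_000, 200))
```

Under these assumptions the daily budget evaluates to about 28.1 J, which matches the value listed in Table 1.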

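
The reward described in the text (priority times packet length in bits, divided by the transmission delay) can be sketched as follows. The delay model used here, in which the packet waits for the bits already queued on the chosen RAT and is then sent at that RAT's bitrate, is an assumption for illustration; the text only states that the delay depends on both the bitrate and the queue occupation. Function and parameter names are hypothetical.

```python
def delay_s(length_bits, rate_bps, queued_bits=0.0):
    """Assumed delay: drain the queued bits, then send the packet itself."""
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    return (queued_bits + length_bits) / rate_bps


def reward(priority, length_bits, delay):
    """Prioritized throughput in bits per second: priority * length / delay."""
    if delay <= 0:
        raise ValueError("delay must be positive")
    return priority * length_bits / delay


if __name__ == "__main__":
    bits = 8 * 100  # a 100-byte packet
    idle = reward(0.5, bits, delay_s(bits, 100_000.0))
    busy = reward(0.5, bits, delay_s(bits, 100_000.0, queued_bits=9_200))
    lora = reward(0.5, bits, delay_s(bits, 2_430.0))
    print(idle, busy, lora)
```

With an empty queue the reward reduces to the priority multiplied by the RAT bitrate, so a backlog on the faster RAT can make the slower one the better choice for a given packet.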

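
To make the training setup more concrete, the following is a minimal sketch of an ANN policy with two tanh hidden layers of 45 and 5 neurons, tuned by a generic Evolution Strategies update. Several points are assumptions, not details taken from the paper: the 8-dimensional input (four scalars plus one usage level and one queue occupation per RAT for two RATs), the linear output layer with one score per RAT, the Gaussian-perturbation form of ES with its hyperparameters, and the `toy_fitness` function, which is a hypothetical stand-in for the simulated daily reward.

```python
import math
import random

LAYERS = [8, 45, 5, 2]  # assumed input size 8; hidden sizes from the text


def n_params(layers=LAYERS):
    """Number of weights and biases in the fully connected network."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))


def forward(theta, state, layers=LAYERS):
    """Feed-forward pass: tanh on hidden layers, linear output scores."""
    x, k = list(state), 0
    for depth, (n_in, n_out) in enumerate(zip(layers, layers[1:])):
        y = []
        for _ in range(n_out):
            weights, bias = theta[k:k + n_in], theta[k + n_in]
            k += n_in + 1
            y.append(sum(w * v for w, v in zip(weights, x)) + bias)
        is_output = depth == len(layers) - 2
        x = y if is_output else [math.tanh(v) for v in y]
    return x


def act(theta, state):
    """Pick the RAT index with the highest output score."""
    scores = forward(theta, state)
    return max(range(len(scores)), key=scores.__getitem__)


def es_step(theta, fitness, rng, pop=20, sigma=0.1, lr=0.05):
    """One ES update: perturb, evaluate, step along fitness-weighted noise."""
    noises = [[rng.gauss(0.0, 1.0) for _ in theta] for _ in range(pop)]
    scores = [fitness([t + sigma * e for t, e in zip(theta, eps)])
              for eps in noises]
    mean = sum(scores) / pop
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / pop) or 1.0
    adv = [(s - mean) / std for s in scores]
    return [t + lr / (pop * sigma) * sum(a * eps[j] for a, eps in zip(adv, noises))
            for j, t in enumerate(theta)]


def toy_fitness(theta, states):
    """Hypothetical objective: fraction of states routed to the emptier queue."""
    hits = sum(act(theta, s) == (0 if s[6] <= s[7] else 1) for s in states)
    return hits / len(states)


if __name__ == "__main__":
    rng = random.Random(0)
    states = [[rng.random() for _ in range(8)] for _ in range(16)]
    theta = [rng.gauss(0.0, 0.1) for _ in range(n_params())]
    for _ in range(5):
        theta = es_step(theta, lambda th: toy_fitness(th, states), rng)
    print("toy fitness after 5 iterations:", toy_fitness(theta, states))
```

In a faithful reproduction, `toy_fitness` would be replaced by a full-day simulation that returns the accumulated reward, and the loop would run for the 1000 iterations mentioned in the text.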