Page 121 - Proceedings of the 2018 ITU Kaleidoscope


Machine learning for a 5G future

( , , , , , , … , , , … , ). That is, the state is defined by the length of the generated packet, the priority of such a packet, the remaining battery of the node, the time of the day, the usage level of each RAT, and the occupation of their corresponding queues. When the above vector is fed into the ANN-based policy, an action that maximizes the expected total reward is obtained. In turn, we have mathematically modeled the reward obtained for each action as follows:

reward(state, action) = priority ⋅ length / delay(action)

The reward is modeled as the priority multiplied by the length of the transmitted packet (in bits), divided by the delay in the transmission. This way, nodes are encouraged to report events as fast as possible, while prioritizing their importance. The units of this reward are bits per second, which match the units of a throughput metric. Hence, what the reward function maximizes is the prioritized throughput of an IoT node. Note that the delay of the action is not only related to the bitrate of the selected RAT but also to the occupation of the queue of such a RAT. Also, since battery depletion prevents nodes from reporting more events (and hence, 0 reward is obtained from that point onwards), the optimal policy naturally optimizes energy consumption as well.

6. SIMULATIONS AND RESULTS

To evaluate the mathematical framework and its solution via RL, we have simulated an IoT network in which nodes support two RATs: a cellular-based connection (this might represent future 5G cellular links) and a LoRa transceiver. Thus, the actions with subscripts 1 and 2 indicate using 5G and LoRa respectively. It is very common, and will be even more popular in 5G deployments, that parts of the licensed spectrum are sub-rented or even offered in an auction-based fashion to third parties (this is the cornerstone of, e.g., cognitive spectrum applications in 4G/5G networks [24], [25]). These third parties may be, for example, operators of IoT networks or governments that need a wireless infrastructure for their Smart City initiatives. We have modeled this spectrum renting scheme and assumed that a sub-band of a wider licensed band is at the disposal of the IoT nodes, offering a transmission speed of 100 kbps. For the LoRa connection, we assume that devices transmit with SF=7 and thus have a throughput of 2.43 kbps. Regarding band usage limitations, we limit the use of 5G networks to 1 Mb per day to reduce operational costs. We do not set any daily restriction on LoRa (its limit is ∞). To illustrate the impact of the off-period policies (imposed in most European countries) on the performance of IoT networks, we simulate two environments, one with such a policy enforced and one without it. Energy consumption of both RATs is taken from [26] and [27] for the cellular and LoRa-based RAT respectively, as determining it is out of the scope of this work. We assume that IoT motes are powered by two AA batteries designed to last at least three years (i.e. the maximum allowed energy consumption for a day is 1/1095 of the total energy stored in such batteries, where 1095 is the number of days in three years). The average event-generation rate is varied between 1 packet every 30 seconds (1/30 events per second) and 1 packet every 10 minutes (1/600 events per second) to assess the influence of this figure on the total attained reward; these values are in line with typical wireless monitoring networks [28]–[30]. Furthermore, packets of varying sizes are assumed to be generated; specifically, lengths are randomly generated with uniform distribution between 30 and 200 bytes (again in accordance with typical IoT deployments [11]). Similarly, priorities of packets are also uniformly distributed between 0 and 1. Finally, Table 1 specifies all considered parameters.

The Evolution Strategies algorithm has been allowed to run for 1000 iterations to iteratively tune the ANN-defined policy. This ANN is, in turn, composed of two hidden layers of 45 and 5 neurons with tanh as an activation function [31]. Figure 1 shows the training phase of the ANN-based policy. In this process the expected total reward obtained for a whole day (that is, γ) increases as the policy improves by iteratively applying the ES algorithm. This illustrates that the obtained policy is progressively refined so that IoT nodes make better transmission decisions. Note that transmission policies are trained off-line in more powerful computers while the practical use
Table 1 – Parameters of the simulation

Parameter                            Value
5G rate                              100 kbps
LoRa rate                            2.43 kbps
LoRa off-period (if applies)         − with = 0.01
5G usage limitation ( 1 )            1 Mb/day
LoRa usage limitation ( 2 )          ∞
5G power consumption ( 1 )           2.15 W
LoRa power consumption ( 2 )         0.1353 W
Battery of nodes ( )                 2 AA batteries (30780 joules in total, 28.1 joules/day)
Average events per second ( )        Varied from 1/30 to 1/600
Packet length ( )                    (30, 200) bytes
Packet priority ( )                  (0, 1)

Figure 1 – Evolution of γ as ES algorithm iterates (first 250 iterations are shown). Obtained for an average event rate of 1/90 and off-period not enforced. Note that γ ultimately indicates the average number of prioritized bits transmitted in a day.
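
As a rough illustration of how the reward definition and the values in Table 1 fit together, the following Python sketch computes the per-packet reward and the daily energy budget. It is not the authors' simulator. The names (`Rat`, `delay_s`, `reward`, `tx_energy_j`) are hypothetical, and two modeling choices are assumptions made only for this example: the delay is taken as the time to drain the queue plus the packet at the RAT bitrate, and the transmission energy is taken as power multiplied by airtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rat:
    """One radio access technology, with the rate and power from Table 1."""
    name: str
    rate_bps: float
    power_w: float


FIVE_G = Rat("5G", 100_000.0, 2.15)
LORA = Rat("LoRa", 2_430.0, 0.1353)

BATTERY_J = 30_780.0       # two AA batteries (Table 1)
LIFETIME_DAYS = 3 * 365    # three-year design lifetime, i.e. 1095 days


def daily_energy_budget_j() -> float:
    """Maximum energy a node may spend per day to last three years."""
    return BATTERY_J / LIFETIME_DAYS


def delay_s(rat: Rat, packet_bits: float, queued_bits: float = 0.0) -> float:
    """Assumed delay model: queue backlog plus packet, drained at the RAT bitrate."""
    return (queued_bits + packet_bits) / rat.rate_bps


def reward(priority: float, packet_bits: float, delay: float) -> float:
    """Prioritized throughput in bits per second: priority * length / delay."""
    return priority * packet_bits / delay


def tx_energy_j(rat: Rat, packet_bits: float) -> float:
    """Assumed energy model: transmit power multiplied by airtime."""
    return rat.power_w * packet_bits / rat.rate_bps


if __name__ == "__main__":
    bits = 200 * 8  # largest packet considered in the simulation
    for rat in (FIVE_G, LORA):
        d = delay_s(rat, bits)
        print(rat.name, reward(0.5, bits, d), tx_energy_j(rat, bits))
    print("daily budget (J):", daily_energy_budget_j())
```

Under the assumed delay model and an empty queue, the reward reduces to the priority multiplied by the bitrate of the chosen RAT, so any backlog in a queue lowers the reward of sending on that RAT. This is the effect the text attributes to queue occupation.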
