Page 206 - Kaleidoscope Academic Conference Proceedings 2021


2021 ITU Kaleidoscope Academic Conference

different network load scenarios, representing light and
heavy network traffic, to simulate traffic variations across
the scenes. The simulation alternates between light and heavy
traffic every 1000 scenes. The total throughput of the
heavy scenario is higher than that of the light one. Each user
has a specific traffic magnitude, defined as a percentage of
the total throughput in accordance with Table 1, which enables
the differentiation of applications. Figure 6 shows the histogram
of traffic throughput for each user in Gbps. Each user exhibits
both the heavy and the light traffic behavior. The incoming
traffic for each user is buffered while buffer space is available;
otherwise, the excess packets are tail-dropped. Packets
are also dropped when they occupy the buffer for more than
10 seconds.
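
The buffering rules above can be sketched in a few lines of Python. This is only a minimal illustration, not the simulator's actual code: the throughput values and user shares come from Table 1, while the buffer capacity, packet size, scene duration, the choice to start with the light load, and all function names are hypothetical assumptions made for the example.

```python
import math
import random
from collections import deque

# Values taken from Table 1 and the text.
TOTAL_THROUGHPUT_GBPS = {"light": 0.48, "heavy": 0.96}
USER_SHARE = {"uav": 0.5, "pedestrian": 0.2, "car": 0.3}
SCENES_PER_LOAD = 1000      # load alternates every 1000 scenes
MAX_AGE_S = 10.0            # packets older than 10 s are dropped

# Hypothetical values, assumed only for this sketch.
BUFFER_CAPACITY = 3000      # packets per user buffer
PACKET_SIZE_BITS = 8000
SCENE_DURATION_S = 1e-3


def load_for_scene(scene):
    """Alternate light/heavy every SCENES_PER_LOAD scenes (light first, assumed)."""
    return "light" if (scene // SCENES_PER_LOAD) % 2 == 0 else "heavy"


def mean_packets(user, scene):
    """Mean number of packets arriving for `user` during one scene."""
    total_bps = TOTAL_THROUGHPUT_GBPS[load_for_scene(scene)] * 1e9
    return total_bps * USER_SHARE[user] * SCENE_DURATION_S / PACKET_SIZE_BITS


def sample_poisson(lam, rng):
    """Draw one Poisson sample with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def step_buffer(buffer, user, scene, rng):
    """Advance one scene for one user and return the number of dropped packets.

    `buffer` is a deque of packet arrival times in seconds (oldest first).
    """
    now = scene * SCENE_DURATION_S
    dropped = 0
    # Drop packets that stayed in the buffer for more than MAX_AGE_S.
    while buffer and now - buffer[0] > MAX_AGE_S:
        buffer.popleft()
        dropped += 1
    # Enqueue new arrivals; whatever does not fit is tail-dropped.
    arrivals = sample_poisson(mean_packets(user, scene), rng)
    accepted = min(arrivals, BUFFER_CAPACITY - len(buffer))
    buffer.extend([now] * accepted)
    return dropped + (arrivals - accepted)
```

Under these assumed packet and scene sizes, the UAV receives on average 30 packets per scene in the light scenario and 60 in the heavy one; other choices only rescale these means.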
Table 1 – Network load information for light and heavy
scenarios.

Network load   Total throughput   UAV (%)   Pedestrian (%)   Car (%)
Light          0.48 Gbps          50%       20%              30%
Heavy          0.96 Gbps          50%       20%              30%

Figure 5 – Example of radiation pattern for specific beam
index with an 8 × 8 UPA.

channel model [16]. The reason for this choice is that we first
want to evolve the AI/ML engine such that newer versions
allow rendering in (near) real time during the training of the
RL agent. Currently, we are not able to render each scene
and the agent's choices, but CAVIAR-v2 is being developed
with this goal. We will later work on CAVIAR-RT-v1,
where RT stands for support for ray tracing.

The simplified $\mathbf{H}$ currently represents a Line-of-Sight (LoS)
channel. A narrowband channel model [16] is used, but
wideband models can be readily incorporated when their
extra computational cost is not an issue. For simplicity, the
users have a single antenna ($N_r = 1$) while the BS has an 8 × 8
UPA ($N_t = 64$).
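
To make these dimensions concrete, the following sketch builds a single-path narrowband LoS channel between an 8 × 8 UPA and a single-antenna user. It is an illustration under stated assumptions rather than the CAVIAR implementation: the half-wavelength element spacing, the steering-vector convention (elements on the y-z plane, elevation measured from the z axis), the unit path gain, the scaling by the square root of the antenna count, and every function name are hypothetical choices.

```python
import cmath
import math

N_SIDE = 8                 # 8 x 8 UPA at the BS (from the text)
N_T = N_SIDE * N_SIDE      # 64 transmit antennas
N_R = 1                    # single-antenna user (from the text)


def upa_steering_vector(elevation, azimuth, n_side=N_SIDE):
    """Unit-norm steering vector of an n_side x n_side UPA.

    Assumes half-wavelength spacing; angles are in radians.
    """
    n = n_side * n_side
    vec = []
    for m in range(n_side):          # element index along y
        for k in range(n_side):      # element index along z
            phase = math.pi * (m * math.sin(elevation) * math.sin(azimuth)
                               + k * math.cos(elevation))
            vec.append(cmath.exp(1j * phase) / math.sqrt(n))
    return vec


def los_channel(elevation, azimuth, gain=1.0 + 0.0j):
    """1 x N_T LoS channel row vector: sqrt(N_t N_r) * gain * conj(a_t)."""
    a_t = upa_steering_vector(elevation, azimuth)
    scale = math.sqrt(N_T * N_R)
    return [scale * gain * x.conjugate() for x in a_t]


def beamforming_gain(h, w):
    """|h w|^2 for a channel row vector h and a unit-norm precoder w."""
    return abs(sum(hi * wi for hi, wi in zip(h, w))) ** 2


# Usage: a beam aligned with the LoS direction versus a misaligned one.
h = los_channel(math.pi / 2, 0.3)
aligned = beamforming_gain(h, upa_steering_vector(math.pi / 2, 0.3))
misaligned = beamforming_gain(h, upa_steering_vector(math.pi / 2, -0.5))
```

With a unit path gain, the aligned beam reaches a gain equal to the number of BS antennas, while the misaligned beam yields a much smaller value.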

The geometric channel model [16] is adopted with $L = 2$
Multipath Components (MPCs):

$$\mathbf{H} = \sqrt{N_t N_r} \sum_{\ell=1}^{L} \alpha_\ell \, \mathbf{a}_r(\phi_\ell^A, \theta_\ell^A) \, \mathbf{a}_t^{*}(\phi_\ell^D, \theta_\ell^D). \tag{3}$$

The parameters in Eq. (3) are obtained as follows. The
phase of the complex gain $\alpha_\ell$ is obtained from a uniform
distribution with support $[0, 2\pi]$. To generate the
magnitude $|\alpha_\ell|$, first the distance $d$ between the BS and
the given receiver is used to calculate the received power
via the Friis equation [17]. The path loss is obtained from
this equation and determines $|\alpha_\ell|$, which decreases with
$d$. The elevation $\phi_\ell$ and azimuth $\theta_\ell$ angles, for departure
(e.g. $\phi_\ell^D$) and arrival (e.g. $\phi_\ell^A$), are obtained from the
orientation provided by the LoS path. The nominal LoS
angles are slightly changed by adding to them zero-mean
Gaussian random variables with a variance of 1 degree.
These angles are used to compose the steering vectors $\mathbf{a}_t$ and
$\mathbf{a}_r$.

3.1 Traffic model

The users' data traffic is defined as Poisson processes with
time-varying mean $\lambda_u[t]$ for user $u$. We specified two

Figure 6 – Histogram of packet traffic received by the BS
for each user.

4. MACHINE LEARNING MODEL

4.1 Evaluation of RL agents

To evaluate the RL agent, the return $G$ over the test episodes
is used. The return $G_e$ for episode $e$ is

$$G_e = \sum_{t=1}^{N_s^e} r_e[t], \tag{4}$$

where $N_s^e$ is the number of scenes in episode $e$. The
corresponding reward $r_e[t]$ at discrete time $t$ is a weighted
sum of transmitted and discarded packets given by

$$r_e[t] = \frac{P_{tx}[t] - 2 P_d[t]}{P_b[t]}, \tag{5}$$

where $P_{tx}[t]$, $P_d[t]$, and $P_b[t]$ correspond, respectively, to the
total amount (summed over all users) of transmitted, dropped,
and buffered packets at time $t$. The reward $r_e[t]$ is restricted
to the range $-2 \le r_e[t] \le 1$. At each time $t$, a single user can
be served, but $P_b[t]$ accounts for the number of packets in all
three buffers. Hence, $r_e[t] = 1$ only if all buffered packets
