ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks
Fig. 2 – Comparison of computational complexity and SINR improvements of the proposed dynamic Q algorithm with other industrial methods: MC (Baseline), Q-Learning and SARSA. (a) The complexity of iteration expectation: the total iteration number for the optimal parameters. (b) The best reward at fixed iteration = 10: the best reward of each method when the iteration number is fixed to it = 10; each point on the mean reward curve is averaged across 1000 epochs with random , and the shaded band is the 95% confidence interval across 40 episodes of the three model settings. (c) The CDF of SINRs. (d) The PMF of SINRs.
Fig. 3 shows the result of sending each model's optimal action into the simulator over 10 training episodes. The dynamic Q model achieves the best average SINR in the ROI among all models, 6.319 dB.
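The average SINRs shown in the four panels of Fig. 3 can be compared directly. The following is a minimal Python sketch, not part of the original paper, that computes the gain of the dynamic Q model over each other method as a difference of average SINRs. The names `avg_sinr_db` and `gain_over` are illustrative, and all four panel values are assumed to be in dB, as the 6.319 dB figure in the text suggests.

```python
# Average SINR in the ROI for each method, as reported in the Fig. 3 panels.
# Units are assumed to be dB, matching the 6.319 dB quoted in the text.
avg_sinr_db = {
    "Baseline (MC)": -2.036,
    "SARSA": -1.724,
    "Q-learning": 1.488,
    "Dynamic Q": 6.319,
}


def gain_over(reference: str, candidate: str = "Dynamic Q") -> float:
    """Return the SINR gain of `candidate` over `reference` in dB.

    Both values are already logarithmic, so the gain is a plain difference.
    """
    return avg_sinr_db[candidate] - avg_sinr_db[reference]


if __name__ == "__main__":
    for method in avg_sinr_db:
        if method != "Dynamic Q":
            print(f"Dynamic Q vs {method}: {gain_over(method):+.3f} dB")
```

Under these assumptions the gain over the MC baseline is 6.319 − (−2.036) = 8.355 dB, which appears consistent with the first of the improvements over MC quoted in the text (around 8.3 dB).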
In Table 3, we compare the average SINRs across 6 different scenarios for the dynamic Q model against MC, SARSA, and Q-Learning, with parameters fed from Fig. 2(b). The dynamic Q model clearly improves the UE SINRs across the 6 different environments, particularly in comparison with MC, where we achieve average SINR improvements of around 8.3 dB, 10.4 dB, 12.2 dB, 11.2 dB and 11.8 dB, respectively.
(a) Baseline: average SINR = −2.036 dB. (b) SARSA: average SINR = −1.724 dB. (c) Q-learning: average SINR = 1.488 dB. (d) Dynamic Q: average SINR = 6.319 dB.
Fig. 3 – The average SINRs of different RL-based ICI mitigation algorithms in a 5G MMIMO system, with parameters fed from Fig. 2(b). White circles are the ROI. = 64, = 57, 0 = 100.
6. CONCLUSION
In this paper, we propose an RL (i.e. dynamic Q-learning) assisted full dynamic beamforming algorithm for ICI mitigation in 5G MMIMO systems. This mitigation approach reduces computational complexity without knowledge of the transmission channel. Simulation results show that the implementation complexity is lower and UE SINRs are significantly improved compared to other industrial methods. For example, in the dense Urban-eMBB scenario, the probability of weak SINRs in the target cell is about 60% lower and computational complexity is reduced by more than 50% compared to the benchmark.
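The "probability of weak SINRs" quoted above can be read as the fraction of UEs whose SINR falls below a threshold, i.e. a point on the empirical CDF. The sketch below is illustrative only: the threshold `WEAK_SINR_DB`, the sample lists and the function names are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence

# Hypothetical threshold below which a UE's SINR counts as "weak" (assumption).
WEAK_SINR_DB = 0.0


def weak_sinr_probability(sinrs_db: Sequence[float],
                          threshold_db: float = WEAK_SINR_DB) -> float:
    """Empirical P(SINR < threshold): fraction of samples strictly below it."""
    if not sinrs_db:
        raise ValueError("need at least one SINR sample")
    return sum(s < threshold_db for s in sinrs_db) / len(sinrs_db)


def relative_reduction(p_benchmark: float, p_proposed: float) -> float:
    """Fractional reduction of the weak-SINR probability versus the benchmark."""
    if p_benchmark == 0:
        raise ValueError("benchmark probability is zero; reduction is undefined")
    return (p_benchmark - p_proposed) / p_benchmark


if __name__ == "__main__":
    # Made-up per-UE SINR samples in dB, for illustration only.
    benchmark = [-6.0, -3.5, -1.0, -0.5, 2.0, 4.0, -2.0, 1.0, -4.0, 3.0]
    proposed = [-1.0, 2.5, 5.0, 6.5, 8.0, 9.0, -0.5, 7.0, 3.0, 10.0]
    p_b = weak_sinr_probability(benchmark)
    p_p = weak_sinr_probability(proposed)
    print(f"benchmark: {p_b:.2f}, proposed: {p_p:.2f}, "
          f"reduction: {relative_reduction(p_b, p_p):.0%}")
```

The reduction is expressed relative to the benchmark probability, so a statement such as "about 60% lower" would correspond to `relative_reduction` returning roughly 0.6 for the measured SINR samples.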
© International Telecommunication Union, 2021