Page 63 - ITU Journal Future and evolving technologies Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks


ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 4

A DYNAMIC Q‑LEARNING BEAMFORMING METHOD FOR INTER‑CELL INTERFERENCE
MITIGATION IN 5G MASSIVE MIMO NETWORKS

Aidong Yang, Ph.D¹, Xinlang Yue¹, Mohan Wu, Ph.D¹, Ye Ouyang, Ph.D¹
¹Telecom Artificial Intelligence Lab, AsiaInfo Technologies, Beijing, China
NOTE: Corresponding author: Aidong Yang, Ph.D, yangad@asiainfo.com

Abstract – Beamforming is an essential technology in 5G Massive Multiple-Input Multiple-Output (MMIMO) communications, which are subject to many impairments due to the nature of the wireless transmission channel. Inter-Cell Interference (ICI) is one of the main obstacles faced by 5G communications due to frequency-reuse technologies. However, finding the optimal beamforming parameter to minimize the ICI requires prior network or channel information that is infeasible to obtain. In this paper, we propose a dynamic Q-learning beamforming method for ICI mitigation in the 5G downlink that does not require prior network or channel knowledge. Compared with a traditional beamforming method and other industrial Reinforcement Learning (RL) methods, the proposed method has lower computational complexity and better convergence efficiency. Performance analysis shows an improvement in quality of service, measured by the Signal-to-Interference-plus-Noise-Ratio (SINR), and robustness across different environments.
Keywords – 5G beamforming, inter‑cell interference, massive MIMO, reinforcement learning

1. INTRODUCTION

Massive Multiple-Input Multiple-Output (MMIMO) is a competitive 5G technology that significantly improves system capacity, signal coverage and spectral efficiency by configuring hundreds of Antenna Elements (AEs) at the Base Station (BS) to achieve effective beamforming. However, the quality of MMIMO beamforming depends on accurate Channel State Information (CSI), pilot contamination and ICI estimation. Moreover, MMIMO beamforming complexity becomes a challenge as the number of AEs at the BS increases. Therefore, it is necessary to explore an effective and efficient beamforming method for ICI mitigation with low power and low complexity [4].

In recent years, accurate MMIMO beamforming has attracted extensive research [3, 4, 5, 6, 7], most of which follows one of two directions: with or without CSI. Hybrid beamforming [3, 4, 5] is representative of the former. It aims to reduce the use of Radio Frequency (RF) chains and decrease the complexity of beamforming compared to conventional methods [2], but it needs to update beams frequently when pilots are received continually at the BS. A smart pilot assignment scheme, which is effective in mitigating interference but is aimed at a single cell, is proposed in [5] to reduce pilot contamination by smartly assigning orthogonal pilots to users. The latter mainly comprises, first, the Monte Carlo method, which searches for the optimal beamforming parameters but suffers from increased computational complexity, and second, supervised Deep Learning (DL) methods. One of them is reported in [7] to study the characteristics of wireless spatial channels and explore preferable pilot assignment for beamforming, but supervised methods require model training beforehand and time-consuming sample data collection.

In this paper, an RL-assisted full beamforming method is developed to efficiently obtain the optimal beamforming parameters of an MMIMO system and address these issues. We fully consider microcell and macro-cell multi-path transmission, which present radio features with high user density and traffic loads, focusing on pedestrian and vehicular users (Dense Urban-eMBB) scenarios [1, 2, 5] that include buildings, mountains and rivers, where the distribution of User Equipment (UEs) changes infrequently; these factors significantly impact coverage. To obtain optimal beamforming, firstly, we use a Poisson distribution to estimate the occurrences of UEs in the target cells with a long-term statistical analysis of data; secondly, we apply an algorithm to quickly search through huge volumes of parameters and obtain the optimal ones. Lastly, we send the optimal parameters into the BS beamforming simulator for the best SINR.

In summary, the main contributions of this work include:

• The proposed RL beamforming method for an MMIMO system is meant to obtain the optimal beamforming parameter; such a method under multi-cell ICI is rarely discussed in the literature. Besides, it does not need any prior network or channel information, and it works for different UE distributions.

• Compared with the traditional beamforming method and other industrial RL methods, the proposed dynamic Q-learning method shrinks the action space during its process, so it requires less time and lower computational complexity to operate.

• As shown in many simulation results, the proposed method performs better than the other methods. Moreover, it is robust to various starting states and different environments.
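
The first step of the approach above, estimating UE occurrences with a Poisson distribution from long-term statistics, can be illustrated with a minimal sketch. It assumes that the number of UEs seen in a cell during each observation window is modelled as a Poisson variable, whose maximum-likelihood rate is simply the sample mean. The function names and the `ue_counts` values are hypothetical and are not taken from the paper.

```python
import math


def poisson_rate_mle(counts):
    """Maximum-likelihood estimate of a Poisson rate: the sample mean."""
    if not counts:
        raise ValueError("at least one observation window is required")
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    return sum(counts) / len(counts)


def poisson_pmf(k, rate):
    """Probability of observing exactly k UEs in one window."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def prob_at_least(k, rate):
    """Probability of observing k or more UEs in one window."""
    return 1.0 - sum(poisson_pmf(i, rate) for i in range(k))


# Hypothetical long-term record: UEs counted in one target cell per window.
ue_counts = [3, 5, 4, 6, 2, 4]
rate = poisson_rate_mle(ue_counts)
print(f"estimated rate: {rate:.2f} UEs per window")
print(f"P(at least 5 UEs in a window): {prob_at_least(5, rate):.3f}")
```

With a rate per cell estimated this way, expected UE occurrences could be used to weight the regions that the beamforming search should cover. How the paper combines the estimate with the search is not stated in this section.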

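
The idea of a Q-learning search that shrinks its action space as it runs can be sketched on a toy problem. This is an illustration under stated assumptions, not the authors' algorithm: the state is the index of the currently applied candidate beam setting, an action selects the next setting, the reward is a synthetic noisy SINR value, and the pruning rule (periodically dropping the action with the lowest mean Q-value) is a hypothetical choice. All names and numbers are invented for the example.

```python
import random


def dynamic_q_learning(sinr_db, episodes=120, steps=50, alpha=0.2, gamma=0.5,
                       epsilon=0.5, prune_every=20, min_actions=2, seed=0):
    """Tabular Q-learning whose action set is pruned while it learns."""
    rng = random.Random(seed)
    n = len(sinr_db)
    q = [[0.0] * n for _ in range(n)]  # q[state][action]
    active = list(range(n))            # candidate settings still in play

    def mean_q(action):
        # Average value of an action over the states that can still occur.
        return sum(q[s][action] for s in active) / len(active)

    for episode in range(1, episodes + 1):
        state = rng.choice(active)
        for _ in range(steps):
            # Epsilon-greedy choice restricted to the surviving actions.
            if rng.random() < epsilon:
                action = rng.choice(active)
            else:
                action = max(active, key=lambda a: q[state][a])
            # Synthetic reward: SINR of the chosen setting plus noise.
            reward = sinr_db[action] + rng.gauss(0.0, 0.5)
            next_state = action
            best_next = max(q[next_state][a] for a in active)
            # Standard Q-learning update.
            q[state][action] += alpha * (
                reward + gamma * best_next - q[state][action])
            state = next_state
        # Hypothetical pruning rule: drop the weakest action periodically.
        if episode % prune_every == 0 and len(active) > min_actions:
            active.remove(min(active, key=mean_q))

    best = max(active, key=mean_q)
    return best, active, q


# Hypothetical mean SINR (dB) for six candidate beam settings.
candidate_sinr_db = [2.0, 8.0, 14.0, 20.0, 5.0, 11.0]
best, survivors, _ = dynamic_q_learning(candidate_sinr_db)
print("selected setting:", best, "surviving settings:", survivors)
```

Because each pruning round removes one action, later updates take the maximum over fewer entries and exploration is spent on fewer candidates, which is the kind of saving in time and computation that the second contribution describes. Pruning too early can discard a good action before its value has been learned, so `prune_every` trades speed against that risk.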