Page 63 - ITU Journal Future and evolving technologies Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks

ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 4







               A DYNAMIC Q‑LEARNING BEAMFORMING METHOD FOR INTER‑CELL INTERFERENCE
                                   MITIGATION IN 5G MASSIVE MIMO NETWORKS

                               Aidong Yang, Ph.D¹, Xinlang Yue¹, Mohan Wu, Ph.D¹, Ye Ouyang, Ph.D¹
                               ¹ Telecom Artificial Intelligence Lab, AsiaInfo Technologies, Beijing, China
                                 NOTE: Corresponding author: Aidong Yang, Ph.D, yangad@asiainfo.com

          Abstract – Beamforming is an essential technology in 5G Massive Multiple‑Input Multiple‑Output (MMIMO)
          communications, which are subject to many impairments due to the nature of the wireless transmission
          channel. Inter‑Cell Interference (ICI) is one of the main obstacles faced by 5G communications due to
          frequency‑reuse technologies. However, finding the optimal beamforming parameter to minimize the ICI
          requires prior network or channel information that is infeasible to obtain. In this paper, we propose a
          dynamic Q‑learning beamforming method for ICI mitigation in the 5G downlink that does not require prior
          network or channel knowledge. Compared with a traditional beamforming method and other industrial
          Reinforcement Learning (RL) methods, the proposed method has lower computational complexity and better
          convergence efficiency. Performance analysis shows the quality of service improvement in terms of
          Signal‑to‑Interference‑plus‑Noise Ratio (SINR) and the robustness towards different environments.
         Keywords – 5G beamforming, inter‑cell interference, massive MIMO, reinforcement learning

          1.  INTRODUCTION

          Massive Multiple‑Input Multiple‑Output (MMIMO) technology is a 5G [...] competence that significantly
          improves system capacity, signal coverage and spectral efficiency by configuring hundreds of Antenna
          Elements (AEs) at the Base Station (BS) to [...] effective beamforming [...]. However, the quality of
          MMIMO beamforming depends on accurate Channel State Information (CSI), pilot contamination and ICI
          estimation [...]. Moreover, MMIMO beamforming complexity becomes a challenge as the number of AEs at
          the BS increases. Therefore, it is necessary to explore an effective and efficient beamforming method
          for ICI mitigation with low power and low complexity [4].

          In recent years, accurate MMIMO beamforming has attracted extensive research [3, 4, 5, 6, 7], which
          almost always follows one of two directions: with or without CSI. Hybrid beamforming [3, 4, 5] is
          representative of the former. It is able to reduce the [...] of Radio Frequency (RF) chains and
          decrease the complexity of beamforming compared to conventional methods [2], but it needs to update
          beams frequently when pilots are received continually at the BS. A smart pilot assignment scheme is
          proposed in [5] to reduce pilot contamination by assigning orthogonal pilots to users; it is effective
          in mitigating interference but is aimed at a single cell. The latter direction mainly [...]: first,
          the Monte Carlo method, which searches for the optimal beamforming parameters but suffers from
          increased computational complexity; and second, supervised Deep Learning (DL) methods. One of them is
          reported in [7] to study the characteristics of wireless spatial channels and explore preferable pilot
          assignment [...] for [...] beamforming, but supervised methods require model training beforehand and
          time-consuming sample data collection.

          In this paper, an RL‑assisted full [...] beamforming method is developed to efficiently [...] the
          optimal [...] MMIMO system [...] address [...] issues. We fully consider [...] microcell [...]
          macro‑cell multi‑path transmission [...], which present radio features with high user density and
          traffic loads focusing on pedestrian and vehicular users (Dense Urban‑eMBB) scenarios [1, 2, 5], such
          as buildings, mountains and rivers, where [...] of User Equipment (UEs) [...] infrequently; these
          factors significantly impact coverage. To [...] optimal beamforming, firstly [...] Poisson [...] to
          estimate the occurrences of UEs in the target cells with a long‑term statistical analysis of the data;
          secondly, we apply an algorithm to search quickly through huge volumes of parameters and obtain the
          optimal [...]. Lastly, we send the optimal parameters into the BS beamforming simulator for the best
          SINR.

          In summary, the main contributions of this work are:

          • The proposed RL beamforming method for an MMIMO system aims to find the optimal beamforming
            parameter; such a method under multi‑cell ICI is rarely discussed in the literature. In addition,
            it does not need any prior network or channel information, and it works for different UE
            distributions.

          • Compared with the traditional beamforming method and other industrial RL methods, the proposed
            dynamic Q‑learning method shrinks the action space as it runs, so it requires less time and
            computation to operate.

          • As the simulation results show, the proposed method performs better than the other methods.
            Moreover, it is robust to various starting states and different environments.


                                             © International Telecommunication Union, 2021                    47