This operation is done to increase the SNR by eliminating possibly all-noise samples.

As a next step, we apply a whitening filter as in the previous approach. The whitened time-domain signal is obtained, similarly to (16), as

$$\tilde{\mathbf{y}}_{\mathrm{w}}[k] = \mathbf{D}_{\mathrm{w}}^{-*}\,\mathbf{y}[k] = \mathbf{D}_{\mathrm{w}}^{-*}\boldsymbol{\Psi}\,\tilde{\mathbf{h}}[k] + \tilde{\mathbf{n}}_{\mathrm{w}}[k], \qquad (23)$$

where $\tilde{\mathbf{n}}_{\mathrm{w}}[k] = \mathbf{D}_{\mathrm{w}}^{-*}\,\mathbf{n}[k] \sim \mathcal{CN}(\mathbf{0}, \sigma^{2}\mathbf{I})$.
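
For concreteness, the whitening step in (23) does not require forming $\mathbf{D}_{\mathrm{w}}^{-*}$ explicitly, since $\mathbf{D}_{\mathrm{w}}^{-*}\mathbf{y} = (\mathbf{D}_{\mathrm{w}}^{*})^{-1}\mathbf{y}$ is a linear solve. A minimal NumPy sketch (the function name and shapes are ours, not from the paper):

```python
import numpy as np

def whiten(y, D_w):
    """Apply the whitening filter of (23): returns D_w^{-*} y.

    y   : (N,) complex received vector y[k] for one delay tap
    D_w : (N, N) whitening factor, assumed invertible
    """
    # D_w^{-*} y = (D_w^*)^{-1} y: a linear solve is more accurate
    # and cheaper than forming the inverse explicitly.
    return np.linalg.solve(D_w.conj().T, y)
```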

Note that the following approach is first applied to the true channels from the training data, regularizing them with white Gaussian noise of very small variance and using uniform grids for the AoAs and AoDs. Then, in the testing stage, the grid points are refined based on the joint AoA/AoD pattern that is extracted from the training data. Since the sparse model and the overall procedure are the same in the training and testing phases except for the measurement matrices (in the testing stage there is an additional matrix $\mathbf{D}_{\mathrm{w}}^{-*}$ multiplying the true channels from the left in (23)), we directly present the method used in the testing stage. The channel estimator in both phases operates on the received signals $\tilde{\mathbf{y}}_{\mathrm{w}}[k]$, for $k \in \mathcal{K}$.

The pattern-coupled SBL method in [29] assumes noisy measurements of the form

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{n}, \qquad (24)$$

where $\mathbf{y}$ is the observed vector, $\mathbf{A}$ is the measurement matrix, and $\mathbf{x}$ is the sparse signal with some unknown block-sparsity patterns. The vector $\mathbf{n}$ is zero-mean Gaussian noise with a scaled identity covariance matrix. Hence, the model is in accordance with the one in (23). Let us define $\mathbf{A} = \mathbf{D}_{\mathrm{w}}^{-*}\boldsymbol{\Psi}$, $\mathbf{y}[k] = \tilde{\mathbf{y}}_{\mathrm{w}}[k]$, $\mathbf{x}[k] = \tilde{\mathbf{h}}[k]$, and $\mathbf{n}[k] = \tilde{\mathbf{n}}_{\mathrm{w}}[k]$. Then, we have all the measurements from (23) for $k \in \mathcal{K}$ in the form

$$\mathbf{y}[k] = \mathbf{A}\,\mathbf{x}[k] + \mathbf{n}[k], \qquad k \in \mathcal{K}. \qquad (25)$$

Let us express the sparse vector $\mathbf{x}[k] \in \mathbb{C}^{G_1 G_2}$ in the following form with special indices:

$$\mathbf{x}[k] = \big[x_{1,1}[k], \ldots, x_{G_1,1}[k],\ x_{1,2}[k], \ldots, x_{G_1,2}[k],\ \ldots,\ x_{1,G_2}[k], \ldots, x_{G_1,G_2}[k]\big]^{T}, \qquad k \in \mathcal{K}, \qquad (26)$$

where $G_1$ and $G_2$ denote the numbers of AoA and AoD grid points, respectively, so that $x_{i,j}[k]$ is the coefficient associated with the $i$-th AoA and $j$-th AoD grid point at delay tap $k$.

Note that the elements of $\mathbf{n}[k]$ are independent and identically distributed zero-mean complex Gaussian random variables with variance $\sigma^{2}$.
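
The stacking in (26) is simply a column-major vectorization of the $G_1 \times G_2$ AoA/AoD coefficient grid. A small sketch of the index bookkeeping, under the $(i, j)$ ordering assumed above (grid sizes are hypothetical placeholders):

```python
import numpy as np

G1, G2 = 8, 8  # hypothetical AoA/AoD grid sizes

def lin_index(i, j, G1=G1):
    """0-based position of x_{i,j} in the stacking of (26):
    the AoA index i runs fastest, the AoD index j slowest."""
    return (j - 1) * G1 + (i - 1)

# Equivalently, a (G1, G2) coefficient grid X maps to the vector of
# (26) by column-major (Fortran-order) vectorization, and back:
X = np.arange(G1 * G2).reshape(G1, G2)
x = X.flatten(order="F")
assert x[lin_index(3, 5)] == X[2, 4]   # x_{3,5} sits at X[i-1, j-1]
```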
                                   

4.1 Proposed pattern-coupled hierarchical model

To exploit both the block-sparse structure along the AoAs and AoDs and the common sparsity for all the delay taps, we define a prior over $\mathbf{x} \triangleq \{\mathbf{x}[k] : k \in \mathcal{K}\}$ as

$$p(\mathbf{x} \mid \boldsymbol{\alpha}) = \prod_{i=1}^{G_1} \prod_{j=1}^{G_2} \prod_{k \in \mathcal{K}} \mathcal{CN}\big(x_{i,j}[k] \,\big|\, 0,\ \gamma_{i,j}^{-1}\big). \qquad (27)$$

To model the pattern-coupled block sparsity, we express the parameter $\gamma_{i,j}$, which is common among the delay taps, as

$$\gamma_{i,j} = \alpha_{i,j} + \beta_{1}\,\alpha_{i-1,j} + \beta_{1}\,\alpha_{i+1,j} + \beta_{2}\,\alpha_{i,j-1} + \beta_{2}\,\alpha_{i,j+1}, \qquad (28)$$

where $\boldsymbol{\alpha} = \{\alpha_{i,j}\}$ are the hyper-parameters controlling the sparsity of $\mathbf{x}$. The parameters $\beta_{1} \in [0,1]$ and $\beta_{2} \in [0,1]$ indicate the pattern relevance between $x_{i,j}[k]$ and its neighboring coefficients, and they are taken as known constants in accordance with the related works. Different from [29], we do not impose a Gamma prior on the hyper-parameters $\{\alpha_{i,j}\}$. Instead, we consider these hyper-parameters to be deterministic and unknown, which is equivalent to assuming a non-informative prior. In our experiments, we find that this approach works better than imposing the Gamma prior.
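
The coupling in (28) can be evaluated for the whole grid at once. The sketch below is our own helper, with a zero-padding boundary convention for out-of-grid neighbors that the excerpt does not specify:

```python
import numpy as np

def coupled_precisions(alpha, beta1, beta2):
    """Evaluate (28) on the whole grid.

    alpha : (G1, G2) non-negative array of alpha_{i,j}
    """
    a = np.pad(alpha, 1)  # zero-pad: out-of-grid neighbors contribute 0
    return (a[1:-1, 1:-1]
            + beta1 * (a[:-2, 1:-1] + a[2:, 1:-1])   # alpha_{i-1,j}, alpha_{i+1,j}
            + beta2 * (a[1:-1, :-2] + a[1:-1, 2:]))  # alpha_{i,j-1}, alpha_{i,j+1}
```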

Note that in the testing stage the noise variance is not given explicitly; instead, range information is provided. So, we assume that we do not know $\lambda = 1/\sigma^{2}$, but we introduce a uniform prior on $\lambda$, i.e., $\lambda \sim \mathcal{U}[\lambda_{\mathrm{low}}, \lambda_{\mathrm{upp}}]$, where the bounds are provided along with the test data. This assumption also differs from the Gamma distribution that is considered in [29].
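
One practical reading of the uniform prior: since the Gaussian likelihood is unimodal in the precision $\lambda$, a MAP update of $\lambda$ reduces to computing the unconstrained maximizer and projecting it onto $[\lambda_{\mathrm{low}}, \lambda_{\mathrm{upp}}]$. A minimal illustration (all numbers are placeholders, not values from the paper):

```python
import numpy as np

# Bounds shipped with the test data (placeholder values).
lam_low, lam_upp = 0.5, 50.0
lam_unconstrained = 12.3   # placeholder unconstrained maximizer
# MAP estimate under the uniform prior: clip to the prior support.
lam = float(np.clip(lam_unconstrained, lam_low, lam_upp))
```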

We utilize an EM algorithm for learning the sparse signal $\mathbf{x}$ and the hyper-parameters $\Theta \triangleq \{\boldsymbol{\alpha}, \lambda\}$. In the EM formulation, the signal $\mathbf{x}$ is treated as a hidden variable, and we iteratively maximize a lower bound on the posterior probability $p(\Theta \mid \mathbf{y})$ (this lower bound is also referred to as the Q-function). The algorithm alternates between an E-step and an M-step. We explain these two steps below.
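
A bare-bones sketch of this alternation is given below. The scaffolding is ours: the M-step updates for $\boldsymbol{\alpha}$ and $\lambda$ are not derived in this excerpt, so they appear only as labeled stubs, while the E-step follows (30) and (31) of the next subsection.

```python
import numpy as np

def update_alpha(Mu, Sigma, beta1, beta2):
    # Placeholder: the alpha update of the M-step is not given in
    # this excerpt, so it is left as a stub.
    raise NotImplementedError

def update_lambda(A, Y, Mu, Sigma):
    # Placeholder for the unconstrained lambda update (see the
    # clipping note above).
    raise NotImplementedError

def em_loop(A, Y, gamma_fn, lam_bounds, beta1, beta2, n_iter=50):
    """Alternate E- and M-steps over Theta = {alpha, lambda}."""
    G = A.shape[1]
    alpha = np.ones(G)                         # flat alpha_{i,j}, order of (26)
    lam = float(np.mean(lam_bounds))           # start inside U[low, upp]
    for _ in range(n_iter):
        gamma = gamma_fn(alpha, beta1, beta2)  # coupled precisions, cf. (28)
        # E-step: Gaussian posterior of each x[k], cf. (30)-(31) below
        Sigma = np.linalg.inv(lam * A.conj().T @ A + np.diag(gamma))
        Mu = lam * Sigma @ (A.conj().T @ Y)    # column k is the mean of x[k]
        # M-step: maximize the Q-function over the hyper-parameters
        alpha = update_alpha(Mu, Sigma, beta1, beta2)
        lam = float(np.clip(update_lambda(A, Y, Mu, Sigma),
                            lam_bounds[0], lam_bounds[1]))
    return Mu, Sigma
```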

4.2 E-Step

In the E-step, we need to compute the posterior distribution of $\mathbf{x}$ conditioned on the observed data and the hyper-parameters estimated in the $t$-th iteration, i.e.,

$$p(\mathbf{x} \mid \mathbf{y}, \Theta^{(t)}) \propto p(\mathbf{x} \mid \boldsymbol{\alpha}^{(t)})\, p(\mathbf{y} \mid \mathbf{x}, \lambda^{(t)}). \qquad (29)$$

The posterior probability can be computed as a multivariate Gaussian distribution, with the mean and covariance matrix for $\mathbf{x}[k]$ given by

$$\boldsymbol{\mu}^{(t)}[k] = \lambda^{(t)} \big(\lambda^{(t)} \mathbf{A}^{*}\mathbf{A} + \mathbf{D}^{(t)}\big)^{-1} \mathbf{A}^{*}\mathbf{y}[k], \qquad k \in \mathcal{K}, \qquad (30)$$

$$\boldsymbol{\Sigma}^{(t)} = \big(\lambda^{(t)} \mathbf{A}^{*}\mathbf{A} + \mathbf{D}^{(t)}\big)^{-1}, \qquad k \in \mathcal{K}, \qquad (31)$$

where $\mathbf{D}^{(t)}$ denotes the diagonal matrix whose entries are the coupled precisions $\gamma_{i,j}^{(t)}$ from (28).
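
In code, (30) and (31) amount to a single matrix inverse shared by all delay taps, since neither $\mathbf{A}$ nor $\mathbf{D}^{(t)}$ depends on $k$. A minimal NumPy sketch (names and shapes are our own):

```python
import numpy as np

def e_step(A, Y, lam, gamma):
    """Posterior mean and covariance of x[k] per (30)-(31).

    A     : (M, G) equivalent measurement matrix
    Y     : (M, K) whitened measurements; column k is y[k]
    lam   : current noise-precision estimate lambda^{(t)}
    gamma : (G,) coupled precisions gamma^{(t)}_{i,j}, flattened
            in the ordering of (26)
    """
    D = np.diag(gamma)
    Sigma = np.linalg.inv(lam * A.conj().T @ A + D)  # (31), shared by all k
    Mu = lam * Sigma @ (A.conj().T @ Y)              # (30) for all k at once
    return Mu, Sigma
```

Because a single inverse serves every delay tap, the common-sparsity assumption across taps also pays off computationally, not just statistically.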