Page 124 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media






1.1 Directional weighting estimation in ITU-R BS.1770-4

Directional weights in the Recommendation were suggested in contributions to the Rapporteur Group, and later disclosed by Komori et al. [4]. Although the documents do not provide further detail on how these calculations were made, they can be traced back to the references below.

Robinson and Whittle first performed a subjective test to investigate loudness as a function of the orientation of the sound source. Through a series of sound pressure level (SPL) measurements at the ears of the listeners, the authors stated a binaural summation law of the form:

$$L = g \times \log_2\left(2^{L_{\text{left}}/g} + 2^{L_{\text{right}}/g}\right), \tag{1}$$

where g is a 6 dB binaural gain and L is the sound pressure level equivalent to any combination of incident sound pressure levels, whether diotic ($L_{\text{left}} = L_{\text{right}}$) or dichotic ($L_{\text{left}} \neq L_{\text{right}}$) [5].

A different binaural summation gain for Equation (1) was derived with the method proposed by Sivonen and Ellermeier in an experiment with narrowband, anechoic stimuli. The experimental gain g was estimated by minimizing the sum of squared errors (SSE) between the directional loudness sensitivities (DLS) of listening test subjects and the sensitivities computed by Equation (1). Squares were summed across I azimuth angles and J repetitions [6]. The minimum SSE is calculated as:

$$\min_{g} \left[ \sum_{i=1}^{I} \sum_{j=1}^{J} \left\{ \mathrm{DLS}_{i,j} - \left[ L_{\text{comp}_i}(g) - L_{\text{ref}}(g) \right] \right\}^2 \right], \tag{2}$$

where $L_{\text{comp}_i}$ and $L_{\text{ref}}$ are levels computed with Equation (1), corresponding to the compared incidence and to the frontal incidence of reference, respectively. The study obtained $L_{\text{ref}}$ and $L_{\text{comp}_i}$, $\forall i$, from individual Head-Related Transfer Functions (HRTFs) of the expert subjects in their listening test. A value of g ≈ 3 dB was then estimated by averaging Equation (2) computations per participant.

The authors of [4] computed the channel weighting values, or binaural loudness summation gains, in Table 2 by evaluating Equation (1) with g = 3 dB and with $L_{\text{left}}(\theta)$ and $L_{\text{right}}(\theta)$ obtained from HRTFs for each azimuth angle θ listed in the table. Based on the finding that the effect of incidence angle on loudness is attenuated for wideband and reverberant sounds [7], the authors chose to normalize results to 1.5 dB and approximate them in 1.5 dB steps to ensure backward compatibility, leading to the directional weighting gains of the ITU model summarized in Table 1.

Table 2 – Binaural summation gains computed for φ = 0° and directional weights proposed by the authors of [4] to RG-32.

Azimuth (θ)             0°      ±30°    ±60°    ±90°    ±110°   ±135°   180°
Computed levels (dB)    0.00    1.36    4.47    5.22    4.46    0.84    −8.25
Normalised gains (dB)   0.00    0.39    1.29    1.50    1.28    0.24    −2.37
Proposed weights (dB)   0.00    0.00    1.50    1.50    1.50    0.00    −1.50

However, a further study by the authors of [6] computed Equation (2) with $L_{\text{ref}}$ and $L_{\text{comp}_i}$, $\forall i$, obtained through SPL measurements taken with a Head and Torso Simulator (HATS), and with DLS subjective scores taken from naive participants. Minimization of the objective function in Equation (2) yielded g ≈ 6 dB, closer to the binaural summation law in [5]. The authors observed that the effect of contralateral incidence on the response variable was larger in the listening test with naive participants, although the difference in binaural gains between the two studies might be due to chance, according to their statistical analysis [8].

Additionally, the experiments in [5, 6] were conducted in anechoic chambers using single-channel narrowband noises as stimuli, while the derived weights for |φ| < 30° were tested in [3, 4] with broadband content rendered to 5.1, 7.1 and 22.2 loudspeaker settings, resulting in different correlations between objective measurements and subjective scores observed in the test sites. It is possible that these different results were due to elevation effects not accounted for in the weighting scheme of Table 1. Therefore, the question of how to model directional effects in the ITU-R loudness algorithm requires further investigation.

The goal of the present study was to obtain subjective data on directional effects in order to estimate a new set of binaural summation gains. The next sections contain a description of the listening test, followed by an attempt to reproduce the estimation that led to ITU-R BS.1770-4 and by a new approach to the problem. The modified algorithm was then tested against a different set of subjective data on multichannel audio.

2.  LISTENING TEST

A loudness matching test was undertaken to obtain DLS responses through the SPL adjustments required for equal loudness of sounds coming from different azimuths and elevations. For this listening test, a 22-channel electroacoustic system was used to reproduce broadband pink noise test signals in an ITU-R BS.1116 critical listening room [10].

2.1 Design

Broadband pink noise stimuli, bandlimited from 200 Hz to 15 kHz, were reproduced by a 22-loudspeaker setup specified as layout ‘H’ in Recommendation ITU-R BS.2051 for advanced sound systems [9]. The layout is described in Table 3, where labels indicate bottom, middle, upper and top loudspeakers and their corresponding azimuths. The time-aligned and level-equalized system was mounted in an ITU-R BS.1116 standard listening room with dimensions 7.35 m length, 5.7 m width, and 2.5 m height [11]. The mean reverberation time between the 500 Hz and 1 kHz octave bands is $RT_{60}$ = 0.22 s.
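As a rough illustration of the estimation procedure described in Section 1.1, the following Python sketch evaluates Equation (1), carries out the minimization of Equation (2) by a simple grid search over candidate gains, and rescales the computed levels of Table 2 so that their maximum equals 1.5 dB. The function names, the grid search, the form of the ear-level inputs, and the reading of "normalize results to 1.5 dB" as a linear rescaling are assumptions made for illustration only; they are not taken from [4] or [6].

```python
import math


def binaural_sum(l_left, l_right, g):
    """Equation (1): equivalent level L (dB) for ear levels l_left, l_right (dB)."""
    return g * math.log2(2 ** (l_left / g) + 2 ** (l_right / g))


def sse(g, dls, ears_comp, ears_ref):
    """Objective of Equation (2) for a single candidate gain g.

    dls[i][j]    : DLS response (dB) for azimuth i, repetition j
    ears_comp[i] : (L_left, L_right) in dB for azimuth i (hypothetical input)
    ears_ref     : (L_left, L_right) in dB for the frontal reference
    """
    l_ref = binaural_sum(ears_ref[0], ears_ref[1], g)
    total = 0.0
    for dls_i, (left, right) in zip(dls, ears_comp):
        # Sensitivity predicted by Equation (1): L_comp_i(g) - L_ref(g)
        predicted = binaural_sum(left, right, g) - l_ref
        total += sum((d - predicted) ** 2 for d in dls_i)
    return total


def estimate_gain(dls, ears_comp, ears_ref, candidates):
    """Grid search (an assumption): the candidate g with the smallest SSE."""
    return min(candidates, key=lambda g: sse(g, dls, ears_comp, ears_ref))


def normalise(levels_db, peak_db=1.5):
    """Assumed normalization: scale levels so that the maximum equals peak_db."""
    scale = peak_db / max(levels_db)
    return [round(x * scale, 2) for x in levels_db]
```

For diotic input, `binaural_sum(60, 60, 6)` returns 66.0, that is, the single-ear level plus g. Applied to the "Computed levels" row of Table 2, `normalise` gives values that agree with the "Normalised gains" row to within 0.01 dB; the small difference at ±60° presumably reflects rounding of the published levels.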
© International Telecommunication Union, 2020