Page 132 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media




3.   KEY TECHNICAL FEATURES

LCEVC deploys a small number of very specialized coding tools that are well suited to the type of data it processes. Some of the key technical features are highlighted below.

3.1  Sparse residual data processing

As further shown in Section 5, the coding scheme processes one or two layers of residual data. This residual data is produced by taking differences between a reference video frame (e.g., a source video) and a base-decoded upscaled version of the video. The resulting residual data is sparse information, typically edges, dots and details, which is then efficiently processed using very simple and small transforms designed to deal with sparse information.

3.2  Efficient use of existing codecs

The base codec is typically used at a lower resolution. Because of this, the base codec operates on a smaller number of pixels, thus allowing the codec to use less power, operate at a lower quantization parameter (QP) and use tools in a more efficient manner.

3.3  Resilient and adaptive coding process

The scheme allows the overall coding process to be resilient to the typical coding artefacts introduced by traditional discrete cosine transform (DCT) block-based codecs. The first enhancement sub-layer (L-1 residuals) enables us to correct artefacts introduced by the base codec, whereas the second enhancement sub-layer (L-2 residuals) enables us to add details and sharpness to the corrected upscaled base for maximum fidelity (up to lossless coding). Typically, the worse the base reconstruction is, the more the first sub-layer may contribute to correcting it. Conversely, the better the base reconstruction is, the more bit rate can be allocated to the second sub-layer to add the finest details.

3.4  Agnostic base enhancement

The scheme can enhance any base codec, from existing ones (MPEG-2, VP8, AVC, HEVC, VP9, AV1, etc.) to future and under-development ones (including EVC and VVC). The reason is that the enhancement operates on a decoded version of the base codec in the pixel domain, and therefore it can be used on any format, as it does not require any information on how the base has been encoded and/or decoded.

3.5  Parallelization

The scheme does not use any inter-block prediction. The image is processed by applying small (2x2 or 4x4) independent transform kernels over the layers of residual data. Since no prediction is made between blocks, each 2x2 or 4x4 block can be processed independently and in parallel. Moreover, each layer is processed separately, thus allowing the decoding of the blocks and the decoding of the layers to be done in a largely parallel manner.

4.   BITSTREAM STRUCTURE

The LCEVC bitstream contains a base layer, which may be at a lower resolution, and an enhancement layer consisting of up to two sub-layers. The following section briefly explains the structure of this bitstream and how the information can be extracted.

While the base layer can be created using any video encoder and is not specified further in the LCEVC standard, the enhancement layer must follow the structure as specified. Similar to other MPEG codecs [3][4], the syntax elements are encapsulated in network abstraction layer (NAL) units, which also help synchronize the enhancement layer information with the base layer decoded information. Depending on the position of the frame within a group of pictures (GOP), additional data specifying the global configuration and controlling the decoder may be present.

The data of one enhancement picture is encoded into several chunks. These data chunks are hierarchically organized as shown in Fig. 1. For each processed plane (nPlanes), up to two enhancement sub-layers (nLevels) are extracted. Each of them again unfolds into numerous coefficient groups of

[Figure: tree of data chunks. nPlanes planes (Plane Y ... Plane V); each plane has nLevels sub-layers; each sub-layer has nLayers chunks, labelled Coefficient Group-1 data (0) ... (nLayers - 1) and Coefficient Group-2 data (0) ... (nLayers - 1).]

Fig. 1 – Encoded enhancement picture data chunk structure
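
The nesting shown in Fig. 1 can be sketched as a simple enumeration. This is only an illustrative sketch, not the syntax defined by the standard: the function name `enumerate_chunks` is hypothetical, the example values (three planes, two sub-layers, four coefficient groups per sub-layer) are assumptions chosen for illustration, and the plane-then-level-then-layer ordering is merely the order suggested by the figure.

```python
from itertools import product


def enumerate_chunks(n_planes, n_levels, n_layers):
    """List (plane, level, layer) index triples in the nested order of Fig. 1.

    Hypothetical helper: planes are numbered from 0, enhancement
    sub-layers from 1 (to match "Coefficient Group-1/-2" in the figure),
    and coefficient groups from 0 to n_layers - 1.
    """
    planes = range(n_planes)
    levels = range(1, n_levels + 1)
    layers = range(n_layers)
    # product() varies the last index fastest, i.e. plane is the outer loop.
    return list(product(planes, levels, layers))


if __name__ == "__main__":
    # Assumed example values: 3 planes, 2 sub-layers, 4 coefficient groups.
    for plane, level, layer in enumerate_chunks(3, 2, 4):
        print(f"plane {plane}: Coefficient Group-{level} data ({layer})")
```

Under these assumptions, one enhancement picture carries nPlanes x nLevels x nLayers data chunks, each of which can be located by its three indices.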




© International Telecommunication Union, 2020