Page 113 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media
Fig. 5 – 4DTM encoder block diagram (blocks: light field, 4D block partitioning, 4D-DCT, hexadeca-tree bitplane decomposition, arithmetic encoder, bitstream).
Fig. 6 – 4DTM 4D-block spatial and view partitioning.
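The R-D decision behind the hexadeca-tree used in this section can be illustrated with a toy recursion. The sketch below is not the 4DTM reference implementation: the names `optimize` and `halves`, the Lagrange multiplier, the flat rate per ternary flag, the coefficient rate of bitplane + 1 bits and the clipping-based distortion are all assumptions, chosen only to make the three-way choice among lowerBitplane, splitBlock and zeroBlock concrete.

```python
from itertools import product

LAMBDA = 1.0     # assumed Lagrange multiplier
FLAG_BITS = 1.6  # assumed flat rate for one ternary flag, roughly log2(3)


def halves(lo, hi):
    """Split the half-open range [lo, hi) into one or two near-equal parts."""
    if hi - lo <= 1:
        return [(lo, hi)]
    mid = lo + (hi - lo + 1) // 2
    return [(lo, mid), (mid, hi)]


def optimize(coeffs, box, bitplane, memo=None):
    """Return (Lagrangian cost, chosen flag) for one 4D block.

    coeffs maps a 4-tuple index to an integer coefficient (missing keys are 0);
    box holds four half-open (lo, hi) ranges, one per dimension.
    """
    memo = {} if memo is None else memo
    key = (box, bitplane)
    if key in memo:
        return memo[key]
    points = list(product(*(range(lo, hi) for lo, hi in box)))
    if len(points) == 1:
        # 1x1x1x1 block: code the magnitude with `bitplane` bits plus a sign bit.
        magnitude = abs(coeffs.get(points[0], 0))
        decoded = min(magnitude, (1 << bitplane) - 1)
        result = ((magnitude - decoded) ** 2 + LAMBDA * (bitplane + 1), "coefficient")
    else:
        flag_cost = LAMBDA * FLAG_BITS
        # zeroBlock: no children; the distortion is the energy of the block.
        energy = sum(coeffs.get(p, 0) ** 2 for p in points)
        options = [(flag_cost + energy, "zeroBlock")]
        if bitplane > 0:
            # lowerBitplane: same block, represented with precision bitplane - 1.
            lowered = optimize(coeffs, box, bitplane - 1, memo)[0]
            options.append((flag_cost + lowered, "lowerBitplane"))
        # splitBlock: up to 16 children; the sub-block costs are added.
        children = product(*(halves(lo, hi) for lo, hi in box))
        split = sum(optimize(coeffs, child, bitplane, memo)[0] for child in children)
        options.append((flag_cost + split, "splitBlock"))
        result = min(options)
    memo[key] = result
    return result


block = {(0, 0, 0, 0): 100}  # one significant coefficient in a 4x4x4x4 block
cost, flag = optimize(block, ((0, 4),) * 4, bitplane=7)
print(round(cost, 1), flag)
```

Under these assumed costs the example selects splitBlock at the root: the sub-block holding the coefficient is split again down to single coefficients, while the fifteen empty sub-blocks are each discarded with a zeroBlock flag. An actual encoder would take its rates from the arithmetic coder rather than from fixed per-flag constants.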
The R-D optimized hexadeca-tree structure dictates the quantization and entropy encoding steps. This tree is built by recursively subdividing a 4D block until all sub-blocks reach a 1×1×1×1 4D block size. Starting from the full 4D block, with the bit plane initially set to a maximum bit-plane value, three operations are performed, represented by a set of ternary flags: lowerBitplane, splitBlock and zeroBlock. The first lowers the bit plane: the descendant of the node is another block with the same dimensions as the original one, but represented with precision bitplane-1. The second splits the block: the node has up to 16 children, each associated with a sub-block of approximately half the length of the original block in all four dimensions. The third discards the block: the node has no children and is represented by an all-zeros block. The three operations generate three Lagrangian costs, one for each operation. The optimization procedure is called recursively for each sub-block and the returned Lagrangian costs are added to obtain the new R-D cost and its associated flag: lowerBitplane, splitBlock or zeroBlock.

The 4D coefficients, flags, and probability context information generated during the encoding process are input to the arithmetic encoder, which generates the compressed representation of the light field.

Random access capability is an important feature of the 4DTM. Because the light field is divided into fixed-size 4D blocks that are independently encoded, random access is provided. Another important feature of the 4DTM is the uniform quality of the reconstructed views, which is very important for applications such as refocusing.

4.2.2 4D-Prediction Mode (4DPM)

In the 4DPM, depth information and camera parameterization are used for efficient representation and coding of intermediate views from a set of reference views. As the first step, the 4DPM method takes the light field, consisting of a two-dimensional array of indexed views, and partitions it into hierarchical subsets, as indicated by a label matrix that assigns a hierarchical level to each view. Each subset of views corresponds to a particular hierarchical level, as illustrated in Fig. 8. Reference views occupy the lowest hierarchical level h₀ = 1, and intermediate views (those with a label greater than 1) are reconstructed based on the views on the lower hierarchical levels [16]. The hierarchical partitioning of views used in 4DPM is therefore similar to the frame referencing structures used in video codecs such as HEVC. However, the efficiency of the hierarchical coding order in 4DPM is based on inter-view redundancies in the angular arrangement of the views in the light field, in a way that parallels the inter-frame dependencies in video coding.

Reference view, depth and view prediction information is encoded by default with JPEG 2000. However, the JPEG Pleno file format syntax also allows support for a whole series of other codec technologies such as JPEG-1, JPEG LS and JPEG XS, and in principle 4DPM can be combined with any still image encoder. Different hierarchical configurations can be selected depending on the characteristics of the light field data and the desired bit rate. For example, a micro-lens based plenoptic image with high inter-view redundancy can be efficiently coded using a single reference view at h₀, while for wide baseline light fields, such as those obtained with camera arrays, it may be more suitable to select multiple reference views to occupy the lowest hierarchical level h₀. For low
© International Telecommunication Union, 2020
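The hierarchical partitioning of views used by 4DPM, described on this page, can be sketched with a small label matrix. The snippet is only an illustration under stated assumptions: the 3×3 layout with a single central reference view, the (row, column) indexing and the function names `views_by_level`, `coding_order` and `candidate_references` are hypothetical and are not taken from the JPEG Pleno specification.

```python
def views_by_level(labels):
    """Group (row, col) view indices by their hierarchical level."""
    levels = {}
    for row_index, row in enumerate(labels):
        for col_index, level in enumerate(row):
            levels.setdefault(level, []).append((row_index, col_index))
    return dict(sorted(levels.items()))


def coding_order(labels):
    """List views level by level, reference views (level 1) first."""
    return [view for views in views_by_level(labels).values() for view in views]


def candidate_references(labels, view):
    """Views on strictly lower levels, from which `view` may be reconstructed."""
    row, col = view
    return [
        (i, j)
        for i, label_row in enumerate(labels)
        for j, level in enumerate(label_row)
        if level < labels[row][col]
    ]


# Assumed example: one reference view in the centre, corners on level 2,
# remaining views on level 3.
labels = [
    [2, 3, 2],
    [3, 1, 3],
    [2, 3, 2],
]

print(coding_order(labels))
print(candidate_references(labels, (0, 0)))
```

In this assumed layout the central view is coded first and has no references, each corner view can draw only on the central view, and the level-3 views can draw on the central view and all four corners. A configuration with several reference views, as suggested for wide baseline light fields, would simply place more entries equal to 1 in the label matrix.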