Page 51 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media


all MIP-prediction modes is equal to 8 Kilobyte. This corresponds to a memory reduction by a factor of 1000 and by a factor of 100 in comparison to the methods of section 3 and section 4, respectively. The key idea to achieve the aforementioned two complexity constraints is to use down-sampling and up-sampling operations in the domain of the prediction input and output.

For predicting the samples of a W × H-block, with W and H integer powers of two between 4 and 64, MIP takes one line of H and W reconstructed neighboring boundary samples left and above the block as input. Then, the prediction signal is generated using the following three steps, which are also summarized in Figure 6:

1. From the boundary samples, four samples in the case W = H = 4 and eight samples otherwise are extracted by averaging.
2. A matrix-vector multiplication, followed by addition of an offset, is carried out with the averaged samples as an input. The result is a reduced prediction signal on a subsampled set of samples in the original block.
3. The prediction signal at the remaining positions is generated from the prediction signal on the subsampled set by linear interpolation.

In the case of a 4 × 4-block, for 0 ≤ i < 2, one defines

$$\mathrm{bdry}^{\mathrm{top}}_{\mathrm{red}}[i] = \frac{1}{2}\left(\mathrm{bdry}^{\mathrm{top}}[2i] + \mathrm{bdry}^{\mathrm{top}}[2i + 1]\right).$$

In all other cases, if the block-width W is given as $W = 4 \cdot 2^{n}$, for 0 ≤ i < 4 one defines

$$\mathrm{bdry}^{\mathrm{top}}_{\mathrm{red}}[i] = \frac{1}{2^{n}} \sum_{j=0}^{2^{n}-1} \mathrm{bdry}^{\mathrm{top}}[2^{n} \cdot i + j].$$

The reduced left boundary $\mathrm{bdry}^{\mathrm{left}}_{\mathrm{red}}$ is defined analogously. The two boundaries $\mathrm{bdry}^{\mathrm{top}}_{\mathrm{red}}$ and $\mathrm{bdry}^{\mathrm{left}}_{\mathrm{red}}$ are concatenated to form the reduced boundary

$$\mathrm{bdry}_{\mathrm{red}} = [\mathrm{bdry}^{\mathrm{left}}_{\mathrm{red}}, \mathrm{bdry}^{\mathrm{top}}_{\mathrm{red}}]; \tag{2}$$

see Fig. 7. It has size 4 for 4 × 4 blocks and size 8 elsewhere.

Fig. 7 – The averaging step for an 8 × 8-block. This results in four samples (two in the case of 4 × 4-blocks) along each axis.
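
The boundary averaging can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the codec's reference implementation: the function names `reduce_boundary` and `reduced_boundary` are hypothetical, the 4 × 4 case is treated as a plain two-sample average, exact (floating-point) averages are kept instead of the integer rounding a real codec would apply, and the left-then-top concatenation order follows the reading of equation (2).

```python
def reduce_boundary(bdry, out_size):
    """Average consecutive groups of samples so that out_size values remain."""
    if len(bdry) % out_size != 0:
        raise ValueError("boundary length must be a multiple of out_size")
    group = len(bdry) // out_size  # 2**n samples are averaged per output value
    return [sum(bdry[i * group:(i + 1) * group]) / group for i in range(out_size)]


def reduced_boundary(bdry_top, bdry_left):
    """Build bdry_red from the top (length W) and left (length H) boundaries."""
    w, h = len(bdry_top), len(bdry_left)
    # Assumption: 2 samples per side for a 4 x 4 block, 4 samples otherwise.
    size = 2 if (w == 4 and h == 4) else 4
    # Assumption: left part first, then top part, as read from equation (2).
    return reduce_boundary(bdry_left, size) + reduce_boundary(bdry_top, size)
```

For an 8 × 8 block with top boundary `[10, 12, 20, 22, 30, 32, 40, 42]`, `reduce_boundary(top, 4)` averages neighboring pairs and returns `[11.0, 21.0, 31.0, 41.0]`, which matches the four samples per axis shown in Fig. 7.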

Fig. 6 – The flowchart of matrix-based intra-prediction for W × H-block.

The averaging step on the boundary, which is performed for all MIP-modes, could be interpreted as a low complexity version of the joint feature extraction that was part of the neural-network-based intra-prediction; see section 3. Moreover, one could rephrase the linear interpolation step by saying that each MIP-mode predicts into the transform domain of the (5, 3)-wavelet transform, where only low subbands are predicted to be non-zero. Thus, conceptually, this part of MIP-prediction is similar to the prediction into the DCT-domain described in the previous section 3. However, note that for the predictors predicting into the DCT-domain, not all high frequency components of the prediction signal were set to zero but rather a more flexible sparsity pattern was used, whereas the MIP-predictors are constrained to generate only low-pass signals.

We now describe each of the three steps in the MIP prediction in more detail. In the first step, the left and top input boundaries $\mathrm{bdry}^{\mathrm{top}}$ and $\mathrm{bdry}^{\mathrm{left}}$ are reduced to smaller boundaries $\mathrm{bdry}^{\mathrm{top}}_{\mathrm{red}}$ and $\mathrm{bdry}^{\mathrm{left}}_{\mathrm{red}}$. Here, $\mathrm{bdry}^{\mathrm{top}}_{\mathrm{red}}$ and $\mathrm{bdry}^{\mathrm{left}}_{\mathrm{red}}$ both consist of 2 samples in the case of a 4 × 4-block and both consist of 4 samples in all other cases.

In the second step, out of the reduced input vector $\mathrm{bdry}_{\mathrm{red}}$ one generates a reduced prediction signal $\mathrm{pred}_{\mathrm{red}}$. The latter signal is a signal on the downsampled block of width $W_{\mathrm{red}}$ and height $H_{\mathrm{red}}$. Here, $W_{\mathrm{red}}$ and $H_{\mathrm{red}}$ are defined as:

$$W_{\mathrm{red}} = 4, \quad H_{\mathrm{red}} = 4; \quad \text{if } W = H = 4$$
$$W_{\mathrm{red}} = \min(W, 8), \quad H_{\mathrm{red}} = \min(H, 8); \quad \text{elsewhere.}$$

The reduced prediction signal $\mathrm{pred}_{\mathrm{red}}$ of the i-th prediction mode is computed by calculating a matrix-vector product and adding an offset:

$$\mathrm{pred}_{\mathrm{red}} = A_i \cdot \mathrm{bdry}_{\mathrm{red}} + b_i. \tag{3}$$

Here, $A_i$ is a matrix that has $W_{\mathrm{red}} \cdot H_{\mathrm{red}}$ rows and 4 columns if W = H = 4 and 8 columns in all other cases. Moreover, $b_i$ is a vector of size $W_{\mathrm{red}} \cdot H_{\mathrm{red}}$.

The matrices and offset vectors needed to generate the prediction signal are taken from three sets $S_0$, $S_1$, $S_2$. The set $S_0$ consists of 18 matrices, each of which has 16 rows and 4 columns, and 18 offset vectors of size 16. Matrices and offset vectors of that set are used for blocks of size 4 × 4. The set $S_1$ consists of 10 matrices, each of which has 16 rows and 8 columns, and 10 offset vectors of size 16. Matrices and offset vectors of that set are used for blocks of sizes 4 × 8, 8 × 4 and 8 × 8. Finally, the set $S_2$ consists of 6 matrices, each of which has 64 rows and 8 columns and
