Page 39 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media
MULTI-VIEWPOINT AND OVERLAYS IN THE MPEG OMAF STANDARD
Igor D.D. Curcio, Kashyap Kammachi Sreedhar, Sujeet S. Mate
Nokia Technologies, Tampere, Finland
Abstract – Recent developments in immersive media have made possible the rise of new multimedia
applications and services that complement the traditional ones, such as media streaming and conferencing.
Omnidirectional video (a.k.a. 360-degree video) enables one such new service, which is progressively being
made available by large media distribution portals as well (e.g., YouTube). With the aim of creating a
standardized solution for 360-degree video streaming, the Moving Picture Experts Group (MPEG) has
developed the Omnidirectional MediA Format (OMAF) second edition, or version 2, which is close to
completion. The major new features of OMAFv2, compared to the first version, include (but are not limited
to) the capability of using overlays and multiple omnidirectional cameras situated at different physical
points (i.e., viewpoints). This paper focuses on the description of two of the new OMAFv2 features, the
overlays and the multi-viewpoints, including the 360-degree video use cases enabled by these two features.
Keywords – Immersive media, MPEG OMAF, multimedia streaming, multi-viewpoints, omnidirectional
video, overlays.
1. INTRODUCTION

Immersive media is one of the current buzzwords in media technologies. It refers to the capability of
making the user feel immersed in the audio, video, and other media, while at the same time increasing the
interactivity level. The Reality-Virtuality continuum [1] allows a wide spectrum of immersion and
interactivity levels, closer to either the real environment or the virtual environment, depending on the
actual application or service considered.

Watching 360-degree videos is one way to consume immersive media. 360-degree video content is typically
played back in a virtual environment using a Head Mounted Display (HMD). Whenever the user can explore
the content only by changing the HMD orientation, that is, by varying the yaw, pitch and roll of the head
(i.e., rotational movements), this is defined as 3 Degrees of Freedom (3DoF) media [2]. YouTube already
offers omnidirectional video on its portal, and this type of medium is becoming more and more popular. If
the consumer is also allowed to move in the 360-degree space and navigate, walk, and see behind the
objects in the scene (i.e., translational movements), this is typically defined as 6 Degrees of Freedom
(6DoF) media [2].

The Moving Picture Experts Group (MPEG) has defined the first standard for an Omnidirectional MediA
Format (OMAF) [3] to enable the easy deployment of interoperable standardized streaming services for
360-degree video. OMAF is also the basis of the technology adopted by the Third Generation Partnership
Project (3GPP) in their specification for omnidirectional video streaming since Release 15 [4]. OMAF
defines the basic storage format as well as the transport over Dynamic Adaptive Streaming over HTTP
(DASH) [23] and MPEG Media Transport (MMT) [24] for audio, video, image, and timed text. Lately, MPEG
has been working on the second version of the OMAF standard [5] with the aim of extending the
functionalities already enabled by the first version, and making its adoption more appealing for service
providers and the media industry in general.

The major features specified in OMAFv2 are overlays, multi-viewpoints, sub-pictures and new tiling
profiles for viewport-dependent streaming. This paper focuses on the first two. Overlays are a way to
enhance the information content of 360-degree video. They allow us to superimpose another piece of
content (e.g., a picture, another video with news, advertisements, text or other) on top of the main
(background) omnidirectional video. Overlays also allow the creation of interactivity points or areas.

The content captured by an omnidirectional capture device, or omnidirectional media corresponding to one
omnidirectional camera, is called a viewpoint in the OMAFv2 terminology. Multi-viewpoint is a set of
capture devices which, for example, may be scattered around a stadium. The OMAFv2 specification enables
a streaming format with multiple viewpoints to allow, for example, switching from one viewpoint to
another, as done by multi-camera directors for traditional video productions.
© International Telecommunication Union, 2020