Page 24 - Kaleidoscope Academic Conference Proceedings 2022
can be considered “consciousness within a restricted domain” [7]: You feel, think and behave as if the place were real, even though you know it is not. This property is key in the success of immersive technologies, since consciousness is a very primitive cognitive function, not associated with any cortical region [8]. In other words, the perceptual illusion of “being there” created by XR technology really addresses the deepest layers of the human brain, thus making it universally applicable to any kind of person.

This perceptual illusion of presence involves different factors [9]. Biocca et al. identified three key components [10]: spatial presence, the illusion of being in a different place; social presence, the illusion of being and interacting with other people; and self-presence, the illusion of having one’s body integrated in the virtual experience. Creating the realverse means addressing those components by transmitting or generating this sense of presence. Technologically, it requires implementing the four perceptual building blocks shown in Figure 1 [11]:

[Figure 1 diagram labels: Spatial Presence, Social Presence, Self Presence; Face, Visit, Meet, Move]
Figure 1 – Building blocks (or fundamental elements) of XR communication systems: face, visit, meet, and move [11].

1. Face is the property of the system to transmit in real time a visual representation of the other person, e.g. through a video-conferencing system. This element enables visual communication. Seeing the other person is key to transmitting non-verbal communication cues, including showing objects of the personal space.
2. Visit is the property of the system to transmit in real time a visual representation of the surroundings of the other person. This enables remote presence: the sense of “being there”, in the physical environment of the remote person, and being able to operate and discuss it.
3. Meet is the property of the system to represent the other person in the same (virtual or physical) space as the user. This enables shared immersion: being immersed in the same (virtual or physical) environment and interacting with the same (virtual or physical) objects.
4. Move is the property of the system itself to represent the user within it and enable its embodied interaction. It means that the actions of the users are represented within the system and allow the user to interact with it.

As we showed in [11], all XR communication systems include one or several of these building blocks, and the combination of blocks they address has a strong impact on the technical architecture they implement. In the next subsection we will describe how we are addressing each of them, with the target of building a distributed reality system which is able to provide all of them simultaneously.

2.1 Visit: remote presence

To address the problem of immersive remote presence we have developed The Owl, a prototype 360-degree video-based telepresence system. 360-degree videos are video recordings where a view in every direction is recorded at the same time, shot using an omnidirectional camera or a collection of cameras. Our prototype, shown in Figure 2, consists of a commercial omnidirectional video camera, a control system on a Raspberry Pi 4, a backend in the cloud, and a client for Meta Quest developed in the 3-D version of the game engine Unity. The system allows video transmission in real time, in equirectangular projection¹, with 4K resolution (8 Mbps) and conversational delay (< 500 ms), as well as audio in both directions. The approach is similar to other prototypes in the state of the art, such as ExLeap [13] or Show Me Around [14].

[Figure 2 image labels: Raspberry Pi 4; © 2022 Nokia]
Figure 2 – The Owl: an immersive communication system prototype (bottom left), which allows a person wearing a head-mounted display (bottom right) to feel present in a remote location and interact with the people there (top).

The Owl has been field-tested in different scenarios, and has allowed us to evaluate 360-degree video technology for real-time communications in use cases such as education [15] or hybrid conferences, with face-to-face and virtual attendees [16]. For better assessment of its QoE, a novel methodology

¹ The equirectangular projection maps the longitude and latitude of the sphere videos to the horizontal and vertical coordinates of the rectangular video [12].

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING VOL. XX, NO. X, XXX XXXX

Fig. 2. Video sources screenshots: (a) Coffee shop, (b) International office, (c) Study in Spain.

TABLE 2
Test content characteristics
Name | Genre | Perspective-taking | Description
Coffee shop | Everyday conversation | Observer | A coffee conversation between foreign and local students about cultural differences
International office | Educational | Actor | A presentation given by a professor to students about the foreign application process
Study in Spain | Discussion | Actor | A conversation about the differences between transport and rental prices in different countries
periments with two HMDs: Samsung GearVR and Lenovo Mirage Solo.
• Presence questionnaire scores, specifically TPI (Lombard et al.) and PQ (Witmer & Singer), obtained from 48 participants.
• Statistical analysis notebook.

The conclusions informed the selection of test material of longer duration (to increase the immersive experience), of the HMD and the evaluation method (touchpad or handheld controller), and of the methodology for video quality and presence. Specifically, the fatigue effect detected during the pilot study may affect the evaluation of socioemotional features, which motivates the methodology for video quality evaluation proposed in this experiment. Additionally, we present a comparison of the scores obtained during the video quality evaluation of the pilot study and of the experiment explained in detail in this paper.

3 WORK APPROACH

Based on the previous analysis, we pose the following Research Questions (RQs):
• RQ1: Is it possible to evaluate video quality in videos of long duration designed for the evaluation of socioemotional features?
• RQ2: Which technical aspects, such as the position of the camera, the type of conversation, the video quality, the acquisition perspective, etc., influence socioemotional features?
• RQ3: Which interactive elements can be provided to the remote client to improve socioemotional concepts such as presence or attention?

To answer these RQs, we designed a subjective experiment where an immersive communication between a provider and a remote client was simulated, presented in Figure 1. At the provider side, a conversation among several people took place, and the remote client attended virtually wearing an HMD. In the subjective test, the observer took the role of the remote client and visualized pre-recorded 360VR videos with fluctuations of quality, simulating a VR streaming communication.

The contents used in the experiment showed simulated conversations around a common topic: international experiences, i.e. working or studying abroad. The main idea behind choosing this specific context was our ability to gather a balanced sample of people who have had international experiences and people who have not. We acquired three contents with different acquisition perspectives (actor and observer), each one of a different genre (everyday conversation, educational, and discussion) about international experiences. For that, student volunteers were recruited for the recordings, both exchange and national students from the university, making the conversations more realistic and fluent. Conversations were in English, making the experiment accessible to different nationalities and mother tongues and increasing the diversity of the sample.

The experiment considered three test conditions, summarized in Table 1, and each participant was assigned one condition. However, in all conditions, participants visualized the same Processed Video Sequences (PVSs). After each video, they were requested to rate its visual quality, as well as to evaluate the socioemotional features of interest: empathy and changes in attitude, spatial and social presence, and attention.

Participants assigned to condition A had the additional task of periodically rating the visual quality of the video during its playback, whenever its quality changed. This is a conventional design to evaluate the subjective quality of the video sequence under different intensities of impairment. However, this focused task might have an impact on the evaluation of socioemotional features compared to the baseline scenario without the task (condition B).

Finally, participants in condition C were provided with an additional interactivity element: the possibility to see their own hands and take notes about the conversation. We hypothesize that this could enhance socioemotional features such as presence and attention with respect to the other conditions.
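
The text does not state how participants were allocated to conditions A, B and C. As a purely illustrative sketch, assuming a seeded, stratified round-robin allocation (the function name `assign_conditions`, the participant IDs and the boolean experience flag are all hypothetical and not taken from the paper), a between-subjects assignment that stays balanced across conditions and across participants with and without international experience could look like this:

```python
import random
from collections import Counter

# Condition labels as named in the text; what each one involves is
# described in the paragraphs above (Table 1 is not reproduced here).
CONDITIONS = ("A", "B", "C")


def assign_conditions(participants, seed=0):
    """Assign each participant to exactly one condition.

    participants: list of (participant_id, has_international_experience).
    Assumption: allocation is balanced within each experience stratum by
    shuffling the stratum and dealing conditions round-robin.
    """
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    assignment = {}
    for stratum in (True, False):
        ids = [pid for pid, exp in participants if exp == stratum]
        rng.shuffle(ids)
        for i, pid in enumerate(ids):
            assignment[pid] = CONDITIONS[i % len(CONDITIONS)]
    return assignment


# Hypothetical sample: 12 participants, half with international experience.
participants = [(f"p{i:02d}", i % 2 == 0) for i in range(12)]
assignment = assign_conditions(participants)
counts = Counter(assignment.values())
print(sorted(counts.items()))  # [('A', 4), ('B', 4), ('C', 4)]
```

Stratifying before dealing out conditions keeps the share of participants with international experience equal in every condition, so differences between conditions are less likely to be confounded with that background variable.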