Page 210 - Kaleidoscope Academic Conference Proceedings 2024
P. 210
2024 ITU Kaleidoscope Academic Conference
Figure 5 – Average SSI for each mode of communication
Figure 7 – Canny edge of the original image
Figure 6 – Original image
3.2 Data Reduction
Figure 8 – Line art of the original image
Data reductions were computed for a solitary image across map of the original image.
several categories: caption only, caption + line art, caption +
canny edge, and depth map + caption. The image measured
728x492 pixels, totaling 358,176 pixels. With 8 bits per pixel,
each channel required 2,865,408 bits, summing to 8,596,224
bits for all three channels (RGB). The original image is shown
in Figure 6.
The caption generated from the model, "man in a hat holding
two guns in his hands," requires 312 bits of data. This
calculation was based on the standard ASCII encoding, which
uses 8 bits to represent each character. Therefore, with 39
characters in the caption, the total number of bits needed is
39 characters * 8 bits/character = 312 bits.
Canny edge detection, applied to an image of (728, 492)
pixels, demands 358,176 bits. This succinct representation Figure 9 – Depth map of the original image
encapsulates the data size necessary for accurately depicting
detected edges, a critical aspect for image analysis. The canny
Figure 10 illustrates the data requirements for various
edge of the original image is depicted in Figure 7.
modes of end-to-end image data transfer. This visualization
Producing line art for an image with dimensions of (728,
underscores the potential sustainability offered by semantic
492) pixels requires 2,865,408 bits of data. This dataset,
communication compared to conventional methods. By
encompassing 358,176 pixels at 8 bits per pixel, faithfully
selectively transmitting only meaningful data, semantic
conveys the line art details. Figure 8 illustrates the resulting
communication reduces the overall data quantity required
line art derived from the original image.
for communication. Additionally, incorporating line art as
Generating a depth map for an image measuring (728, 492)
a second mode yields improved performance metrics such
pixels requires 2,865,408 bits of data. This data comprises as MSE, PSNR, and SSI, while also contributing to a
358,176 pixels at 8 bits per pixel, accurately representing significant decrease in the data used for communication. Our
depth information for each pixel. Figure 9 displays the depth results suggest that line art achieves an ideal equilibrium
– 166 –