Page 223 - Kaleidoscope Academic Conference Proceedings 2024

Innovation and Digital Transformation for a Sustainable World

parameters and employs a CNN for classification. The predictions from these channels are integrated using a fusion mechanism, which allows the system to combine the insights gained from both approaches. Upon successful gesture classification, the system triggers the corresponding operations of home appliances via a relay connected to a Raspberry Pi, thereby controlling their functions through hand gestures.

The algorithm for tensor extraction takes video frames as input and produces sixteen 3D tensors as output.

Input: ICVL Hand Gesture Dataset with depth images
Output: Sixteen 3D tensors
1. Begin
2. Load the ICVL Hand Gesture Dataset.
3. Apply median filtering as a part of data preprocessing.
4. Ensure uniform sequence lengths through padding or truncation.
5. Apply 2D attention along with self-attention to the model to get the feature tensors.
6. Train the model with the training dataset samples.
7. Fine-tune the model using the error rate and loss observed in extracting the feature tensors.
8. End

The algorithm for gesture classification takes the tensors as input and applies a CNN model for classifying gestures. In addition, a transfer learning model is trained with a dynamic learning rate, and a probabilistic fusion is performed to identify the gesture class.

Input: Sixteen 3D tensors
Output: Predicted gesture
1. Begin
2. Initialize a Sequential model.
3. Add convolutional and pooling layers to the model:
3.1: For each Conv3D layer with parameters (filters=f, kernel_size=k, padding=p): perform convolution: H_i = activation(Conv(H_{i-1}; f, k, p)), where H_0 = H.
3.2: For each MaxPooling3D layer with parameters (pool_size=s, padding=p): apply pooling: H_i = MaxPool(H_{i-1}; s, p).
3.3: Apply the Conv3D layer and MaxPooling3D layer twice iteratively.
4. Add a flatten layer to the model:
4.1: Flatten the output tensor H_final from the last Conv3D or MaxPooling3D layer into a 1D tensor.
5. Add dense layers to the model:
5.1: For each Dense layer with units u and activation a: compute the dense output: P(i) = a(Dense(H_{i-1}; u)), where H_{i-1} = H_final.
6. Apply ResNet with a dynamic learning rate to provide the output P(j).
7. Add the output layer to model and compute the class

[Workflow diagram; block labels: Web Camera Feed; Frame Extraction; Region of Interest Identification; HAAR Cascade; Image Pre-processing; 3D CNN Model; Attention based CNN Model; Tensor Extraction; Transfer learning model with dynamic learning rate; Class Probability Fusion; Gesture Classification; Relay; Appliance Control (Fan, Light)]

Figure 1 - Workflow
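
The class probability fusion step in the workflow can be illustrated with a minimal sketch. The text does not state which fusion rule is used, so the weighted average below is an assumption, as are the function names, the weight parameter, and the gesture labels, which are hypothetical and chosen only to match the fan and light appliances in Figure 1.

```python
from typing import List, Sequence, Tuple


def fuse_probabilities(p_cnn: Sequence[float],
                       p_transfer: Sequence[float],
                       weight: float = 0.5) -> List[float]:
    """Fuse two class-probability vectors by a weighted average.

    Assumption: the paper only says "probabilistic fusion"; a weighted
    average followed by renormalisation is one simple choice.
    """
    if len(p_cnn) != len(p_transfer):
        raise ValueError("both channels must score the same set of classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = [weight * a + (1.0 - weight) * b
             for a, b in zip(p_cnn, p_transfer)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    # Renormalise so the fused scores again sum to 1.
    return [value / total for value in fused]


def predict_gesture(p_cnn: Sequence[float],
                    p_transfer: Sequence[float],
                    labels: Sequence[str],
                    weight: float = 0.5) -> Tuple[str, float]:
    """Return the label with the highest fused probability and its score."""
    fused = fuse_probabilities(p_cnn, p_transfer, weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused[best]


# Hypothetical gesture labels and channel outputs, for illustration only.
LABELS = ["fan_on", "fan_off", "light_on", "light_off"]
p_cnn = [0.10, 0.20, 0.60, 0.10]        # output of the CNN channel
p_transfer = [0.05, 0.50, 0.40, 0.05]   # output of the transfer learning channel

label, confidence = predict_gesture(p_cnn, p_transfer, LABELS)
print(label, round(confidence, 2))
```

With equal weights, the two channels disagree on the most likely class, and the fused vector settles the decision. In a deployed system the returned label would then select which relay channel to switch. Other fusion rules, such as a product of probabilities, would fit the same interface.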
