Page 223 - Kaleidoscope Academic Conference Proceedings 2024
Innovation and Digital Transformation for a Sustainable World
parameters and employs a CNN for classification. The predictions from these channels are integrated using a fusion mechanism, which allows the system to combine the insights gained from both approaches. Upon successful gesture classification, the system triggers the corresponding operations of home appliances via a relay connected to a Raspberry Pi, thereby controlling their functions through hand gestures.

The algorithm for tensor extraction takes video frames as input and produces sixteen 3D tensors as output.

Input: ICVL Hand Gesture Dataset with depth images
Output: Sixteen 3D tensors
1. Begin
2. Load the ICVL Hand Gesture Dataset.
3. Apply median filtering as part of data preprocessing.
4. Ensure uniform sequence lengths through padding or truncation.
5. Apply 2D attention along with self-attention to the model to obtain the feature tensors.
6. Train the model with the training dataset samples.
7. Fine-tune the model using the error rate and loss observed in extracting the feature tensors.
8. End

The algorithm for gesture classification takes the tensors as input and applies a CNN model for classifying gestures. In addition, a transfer learning model is trained with a dynamic learning rate, and a probabilistic fusion is performed to identify the gesture class.

Input: Sixteen 3D tensors
Output: Gesture predicted
1. Begin
2. Initialize a Sequential model.
3. Add convolutional and pooling layers to the model:
   3.1. For each Conv3D layer with parameters (filters=f, kernel_size=k, padding=p), perform convolution: H_i = activation(Conv(H_{i-1}; f, k, p)), where H_0 = H.
   3.2. For each MaxPooling3D layer with parameters (pool_size=s, padding=p), apply pooling: H_i = MaxPool(H_{i-1}; s, p).
   3.3. Apply the Conv3D layer and the MaxPooling3D layer twice iteratively.
4. Add a flatten layer to the model:
   4.1. Flatten the output tensor H_final from the last Conv3D or MaxPooling3D layer into a 1D tensor.
5. Add dense layers to the model:
   5.1. For each Dense layer with units u and activation a, compute the dense output: P(i) = a(Dense(H_{i-1}; u)), where H_{i-1} = H_final.
6. Apply ResNet with a dynamic learning rate to provide the output P(j).
7. Add the output layer to the model and compute the class
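The probabilistic fusion of the CNN output P(i) and the transfer learning output P(j) is not specified in detail in this section. The following is a minimal sketch of one common choice, a weighted average of the two class-probability vectors followed by an argmax. The function names `fuse_probabilities` and `predict_gesture` and the parameter `weight_cnn` are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence, Tuple


def fuse_probabilities(p_i: Sequence[float],
                       p_j: Sequence[float],
                       weight_cnn: float = 0.5) -> List[float]:
    """Weighted average of two class-probability vectors, renormalised to sum to 1.

    p_i: class probabilities from the CNN branch (P(i) in the algorithm).
    p_j: class probabilities from the transfer learning branch (P(j)).
    weight_cnn: assumed mixing weight for the CNN branch (hypothetical parameter).
    """
    if len(p_i) != len(p_j):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= weight_cnn <= 1.0:
        raise ValueError("weight_cnn must lie in [0, 1]")
    fused = [weight_cnn * a + (1.0 - weight_cnn) * b for a, b in zip(p_i, p_j)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("probabilities must have a positive sum")
    return [value / total for value in fused]


def predict_gesture(p_i: Sequence[float],
                    p_j: Sequence[float],
                    weight_cnn: float = 0.5) -> Tuple[int, List[float]]:
    """Return the index of the most probable class and the fused distribution."""
    fused = fuse_probabilities(p_i, p_j, weight_cnn)
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


if __name__ == "__main__":
    # Illustrative probabilities for three gesture classes (made-up values).
    best, fused = predict_gesture([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
    print(best, fused)
```

With equal weights the two branches contribute equally; a validation set could be used to choose `weight_cnn` if one branch proves more reliable.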
[Figure: block diagram with the blocks Web Camera Feed, Frame Extraction, Region of Interest Identification (HAAR Cascade), Image Pre-processing, 3D Attention based CNN Model, Tensor Extraction, CNN Model, Transfer learning model with dynamic learning rate, Class Probability Fusion, Gesture Classification, Appliance Control, Relay, and the appliances Fan, Light, AC and TV]
Figure 1 - Workflow
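The final stage of the workflow, in which a classified gesture switches an appliance through the relay, could be sketched as follows. This is a sketch under stated assumptions: the gesture-to-appliance mapping, the relay channel numbers, the confidence threshold and the `RelayBoard` class are all hypothetical, and the relay is simulated in memory so that the example runs without a Raspberry Pi.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping from predicted gesture class index to appliance name.
GESTURE_TO_APPLIANCE: Dict[int, str] = {0: "Fan", 1: "Light", 2: "AC", 3: "TV"}


class RelayBoard:
    """In-memory stand-in for a relay board (no real hardware access)."""

    def __init__(self, channels: Dict[str, int]) -> None:
        self.channels = dict(channels)                    # appliance -> relay channel (assumed)
        self.state = {name: False for name in channels}   # False means switched off

    def _write(self, channel: int, on: bool) -> None:
        # Placeholder: on real hardware, a GPIO write for this channel would go here.
        pass

    def toggle(self, appliance: str) -> bool:
        """Flip the appliance state and return the new state."""
        new_state = not self.state[appliance]
        self._write(self.channels[appliance], new_state)
        self.state[appliance] = new_state
        return new_state


def handle_gesture(board: RelayBoard,
                   gesture_class: int,
                   confidence: float,
                   threshold: float = 0.8) -> Optional[Tuple[str, bool]]:
    """Toggle the mapped appliance if the prediction is confident enough.

    Returns (appliance, new_state), or None when the gesture is unmapped
    or the confidence is below the assumed threshold.
    """
    appliance = GESTURE_TO_APPLIANCE.get(gesture_class)
    if appliance is None or confidence < threshold:
        return None
    return appliance, board.toggle(appliance)


if __name__ == "__main__":
    board = RelayBoard({"Fan": 1, "Light": 2, "AC": 3, "TV": 4})
    print(handle_gesture(board, 1, 0.93))
```

Ignoring low-confidence predictions is a design choice made here to avoid switching an appliance on an ambiguous gesture; the section does not state whether the system applies such a threshold.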