Kaleidoscope Academic Conference Proceedings 2024

Innovation and Digital Transformation for a Sustainable World




parameters and employs a CNN for classification. The predictions from these channels are integrated using a fusion mechanism, which allows the system to combine the insights gained from both approaches. Upon successful gesture classification, the system triggers the corresponding operations of home appliances via a relay connected to a Raspberry Pi, thereby controlling their functions through hand gestures.

The algorithm for tensor extraction takes the frames of a video as input and provides 16 3D tensors as output.

Input: ICVL Hand Gesture Dataset with depth images
Output: Sixteen 3D tensor points
1. Begin
2. Load the ICVL Hand Gesture Dataset.
3. Apply median filtering as a part of data preprocessing.
4. Ensure uniform sequence lengths through padding or truncation.
5. Apply 2D attention along with self-attention to the model to get the feature tensors.
6. Train the model with the training dataset samples.
7. Fine-tune the model using the error rate and loss in extracting the feature tensors.
8. End

The algorithm for gesture classification takes the tensors as input and applies a CNN model for classifying gestures. In addition, a transfer learning model is trained with a dynamic learning rate, and a probabilistic fusion is performed to identify the gesture class.

Input: Sixteen 3D tensors
Output: Gesture predicted
1. Begin
2. Initialize a Sequential model.
3. Add convolutional and pooling layers to the model:
3.1: Conv3D layer with parameters (filters=f, kernel_size=k, padding=p): Perform convolution: H_i = activation(Conv(H_{i-1}; f, k, p)) where H_0 = H.
3.2: For each MaxPooling3D layer with parameters (pool_size=s, padding=p): Apply pooling: H_i = MaxPool(H_{i-1}; s, p).
3.3: Apply the Conv3D layer and MaxPooling3D layer twice iteratively.
4. Add a flatten layer to the model.
4.1: Flatten the output tensor H_final from the last Conv3D or MaxPooling3D layer into a 1D tensor.
5. Add dense layers to the model.
5.1: For each Dense layer with units u and activation a: Compute dense output: P(i) = a(Dense(H_{i-1}; u)) where H_{i-1} = H_final.
6. Apply ResNet with a dynamic learning rate to provide output: P(j)
7. Add the output layer to model and compute the class


[Figure: block diagram. Web Camera Feed -> Frame Extraction -> Image Pre-processing -> Region of Interest Identification (HAAR Cascade) -> 3D Attention based CNN Model -> Tensor Extraction -> CNN Model, in parallel with Transfer learning model with dynamic learning rate -> Class Probability Fusion -> Gesture Classification -> Appliance Control -> Relay -> Fan, Light, AC, TV]

Figure 1 - Workflow
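
The last stages of the workflow in Figure 1 (class probability fusion, gesture classification, and appliance control through the relay) can be illustrated with a minimal sketch. The text does not specify the fusion rule, the gesture labels, or the relay wiring, so everything below is an assumption made for illustration: the fusion is taken to be a weighted average of the two class-probability vectors, the labels and pin numbers are hypothetical, and each gesture is assumed to toggle one appliance. The pin writer is passed in as a function so the sketch runs without Raspberry Pi hardware.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical gesture labels and relay pin numbers (not given in the text).
GESTURES: List[str] = ["fan", "light", "ac", "tv"]
RELAY_PINS: Dict[str, int] = {"fan": 17, "light": 27, "ac": 22, "tv": 23}


def fuse_probabilities(
    p_cnn: Sequence[float],
    p_transfer: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Fuse two class-probability vectors by a weighted average.

    Assumption: the paper's "probabilistic fusion" is modelled here as a
    convex combination, renormalised so the result sums to 1.
    """
    if len(p_cnn) != len(p_transfer):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = [weight * a + (1.0 - weight) * b for a, b in zip(p_cnn, p_transfer)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("probabilities must have a positive sum")
    return [value / total for value in fused]


def predict_gesture(fused: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest fused probability."""
    best_index = max(range(len(fused)), key=lambda i: fused[i])
    return labels[best_index]


def toggle_appliance(
    gesture: str,
    states: Dict[str, bool],
    write_pin: Callable[[int, bool], None],
) -> bool:
    """Flip the relay channel mapped to the gesture and return its new state."""
    pin = RELAY_PINS[gesture]
    states[gesture] = not states.get(gesture, False)
    write_pin(pin, states[gesture])
    return states[gesture]


if __name__ == "__main__":
    # Made-up probabilities from the two branches, for illustration only.
    p_cnn = [0.10, 0.60, 0.20, 0.10]
    p_transfer = [0.05, 0.45, 0.40, 0.10]

    fused = fuse_probabilities(p_cnn, p_transfer)
    gesture = predict_gesture(fused, GESTURES)

    states: Dict[str, bool] = {}
    toggle_appliance(
        gesture,
        states,
        lambda pin, on: print(f"relay pin {pin} -> {'ON' if on else 'OFF'}"),
    )
```

On a Raspberry Pi, the `write_pin` callback would be replaced by a call into whichever GPIO library drives the relay board; the equal weighting of the two branches is only a starting point and could be tuned on validation data.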








