Page 225 - Kaleidoscope Academic Conference Proceedings 2024

Innovation and Digital Transformation for a Sustainable World




integrate the Leap Gesture Dataset was guided by the understanding that a holistic approach, combining image data and key hand points, is vital for training the CNN model to get the class labels as intermediate output.

In the context of our gesture classification module, the system design goal lies in constructing and training a Convolutional Neural Network (CNN) model for effective hand gesture recognition. The implementation process commences by loading preprocessed data encompassing 3D pose information and the corresponding coarse labels for various gestures. The model consists of convolutional layers, max-pooling layers, fully connected layers, ReLU activation functions, and L2 regularization to prevent overfitting. The model is subsequently compiled with the Adam optimizer and a sparse categorical cross-entropy loss function, making it ready for training. Training runs for up to 30 epochs, with performance monitored against a validation set to ensure generalization. Early stopping is used to halt training when overfitting occurs.

3.3   Transfer learning with dynamic learning rate

In the proposed system, once the region of interest is identified, it is passed to the next module, which employs a ResNet-based transfer learning model with a dynamically adjusted learning rate for gesture classification. ResNet (Residual Network) architectures are renowned for their ability to effectively train deep neural networks, even with a large number of layers. By leveraging transfer learning, the system capitalizes on pre-trained ResNet models, fine-tuning them to recognize hand gestures specific to the application. This approach significantly reduces the training time and computational resources required to achieve high classification accuracy.

One key aspect of the model is its dynamic learning rate adjustment mechanism. The system employs a piecewise learning rate schedule with a learning rate drop factor of 0.2. This factor allows for the systematic reduction of the learning rate during training, thereby enabling the model to converge more effectively. The learning rate drop period is set to 1 epoch, ensuring that the learning rate is updated at the end of each training epoch. Additionally, the initial learning rate is set to 1 × 10^-4, providing an appropriate starting point for the training process.

The dynamic adjustment of the learning rate is crucial for optimizing the training process and improving the model's performance over time. By gradually decreasing the learning rate as training progresses, the system takes progressively smaller update steps, which reduces oscillation around a minimum and facilitates smoother convergence. This dynamic learning rate strategy helps the model adapt to the complexities of the gesture recognition task, ultimately leading to more accurate classification results.

Finally, the output of the ResNet-based transfer learning model is used for gesture classification, where gestures are categorized into n classes based on their visual representations. This classification process is essential for enabling the system to accurately interpret and respond to user gestures, thereby facilitating intuitive and seamless interaction within the smart home environment. Through the integration of advanced learning techniques and dynamic learning rate adjustment, the system achieves robust and efficient gesture recognition capabilities, enhancing user experience and system performance.

3.4   Class Probability Fusion and Integration with IoT devices

Following the fusion process, the system evaluates the probabilities associated with each class to determine the final predicted gesture. This strategic fusion of predictions ensures that the system maximizes its capability to capture diverse aspects of gestures, thereby enhancing overall accuracy and reliability in classification tasks. By effectively leveraging the strengths of both channels, the system enables seamless and intuitive interaction within smart home environments, enhancing user experience.

The technique behind class probability fusion is mathematically defined as follows:

Let P(i) represent the probability of class i predicted by the attention-based CNN model (Model 1) and P(j) represent the probability of class j predicted by the CNN model with dynamic learning rate (Model 2), where i, j = 1, 2, …, n and n is the number of classes. The final predicted gesture is determined by selecting the class with the highest probability among the predictions from both models. This is expressed as:

\[
\text{Final predicted gesture} = \operatorname{argmax}\bigl(\max\bigl(P(i),\, P(j)\bigr)\bigr) \tag{1}
\]

Equation (1) computes the maximum probability among the predictions from both models for each class and selects the class with the highest maximum probability as the final predicted gesture.

Once the gesture is successfully classified using the trained model on the Raspberry Pi, the resulting class label serves as a command to operate appliances via a relay system. This relay system acts as an intermediary between the Raspberry Pi and the appliances, enabling seamless integration of gesture-based control into the smart home environment. With this setup, a wide range of appliances including fans, lights, televisions, and air-conditioners can be controlled using hand gestures recognized by the proposed model. Each gesture class corresponds to a specific appliance operation, allowing users to intuitively interact with their smart home ecosystem without the need for physical switches or remote controls. This integration of the gesture recognition system with IoT-based appliance control enhances user experience and facilitates a more interactive and responsive home environment. The hardware experimental setup is shown in Figure 2.
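
The piecewise schedule described in Section 3.3 can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: it assumes the learning rate is multiplied by the drop factor once every drop period, counted in zero-based epochs, and the function name `piecewise_learning_rate` is hypothetical rather than taken from the authors' implementation. Only the three parameter values (initial rate 1 × 10^-4, drop factor 0.2, drop period 1 epoch) come from the text.

```python
def piecewise_learning_rate(epoch, initial_lr=1e-4, drop_factor=0.2, drop_period=1):
    """Return the learning rate in effect during a zero-based epoch.

    Assumption: the rate is multiplied by `drop_factor` once every
    `drop_period` epochs, which is the usual meaning of a piecewise schedule.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if drop_period < 1:
        raise ValueError("drop_period must be at least 1")
    num_drops = epoch // drop_period  # number of drops applied so far
    return initial_lr * drop_factor ** num_drops


# Print the rate for the first few epochs using the values given in the text.
for epoch in range(4):
    print(f"epoch {epoch}: learning rate = {piecewise_learning_rate(epoch):.2e}")
```

With these settings the rate shrinks by a factor of five after every epoch, so the first epoch uses 1 × 10^-4, the second 2 × 10^-5, and the third 4 × 10^-6.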

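
The fusion rule of Equation (1) in Section 3.4 can be sketched in a few lines. This is a minimal sketch, assuming both models output one probability per class in the same class order. The function name `fuse_predictions` and the probability values are made up for illustration, and class indices are zero-based here whereas the text counts classes from 1.

```python
def fuse_predictions(probs_model1, probs_model2):
    """Fuse two per-class probability lists by element-wise maximum.

    Returns (class_index, fused_probability) for the winning class.
    Ties are resolved in favour of the lowest class index.
    """
    if not probs_model1 or len(probs_model1) != len(probs_model2):
        raise ValueError("both models must score the same non-empty set of classes")
    # For each class, keep the higher of the two model probabilities.
    fused = [max(p1, p2) for p1, p2 in zip(probs_model1, probs_model2)]
    # Select the class whose fused probability is largest.
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused[best]


# Illustrative (made-up) outputs for a three-class problem.
p_model1 = [0.10, 0.65, 0.25]  # hypothetical attention-based CNN output
p_model2 = [0.05, 0.15, 0.80]  # hypothetical ResNet transfer-learning output
label, score = fuse_predictions(p_model1, p_model2)
print(label, score)
```

In this example the two models disagree, and the fused decision follows whichever model is more confident in its own top class.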

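
The mapping from a class label to an appliance operation, as described in Section 3.4, could be organized as a simple lookup table. The sketch below is hypothetical throughout: the text does not specify the gesture-to-appliance assignments, so the `GESTURE_ACTIONS` table, the channel names, and the in-memory `RelayBoard` class are all assumptions. On real hardware the `set` method would drive the relay pins of the Raspberry Pi instead of updating a dictionary.

```python
class RelayBoard:
    """In-memory stand-in for a relay board (assumption, not real hardware)."""

    def __init__(self, channels):
        # Every relay channel starts switched off.
        self.state = {name: False for name in channels}

    def set(self, channel, on):
        if channel not in self.state:
            raise KeyError(f"unknown relay channel: {channel}")
        self.state[channel] = on


# Hypothetical assignment of gesture class labels to appliance operations.
GESTURE_ACTIONS = {
    0: ("light", True),
    1: ("light", False),
    2: ("fan", True),
    3: ("fan", False),
}


def apply_gesture(board, class_label):
    """Switch the relay mapped to `class_label`; return False if unmapped."""
    action = GESTURE_ACTIONS.get(class_label)
    if action is None:
        return False  # unrecognized labels leave every appliance unchanged
    channel, on = action
    board.set(channel, on)
    return True


board = RelayBoard(["light", "fan"])
apply_gesture(board, 2)  # hypothetical "fan on" gesture
print(board.state)
```

Keeping the mapping in a table separates the classifier from the appliance wiring, so new gestures or appliances can be added without touching the recognition code.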


