Page 225 - Kaleidoscope Academic Conference Proceedings 2024


Innovation and Digital Transformation for a Sustainable World

integrate the Leap Gesture Dataset was guided by the understanding that a holistic approach, combining image data and key hand points, is vital for training the CNN model to produce class labels as an intermediate output.

In our gesture classification module, the design goal is to construct and train a Convolutional Neural Network (CNN) model for effective hand gesture recognition. Implementation begins by loading preprocessed data comprising 3D pose information and the corresponding coarse labels for the various gestures. The model consists of convolutional layers, max-pooling layers, and fully connected layers with ReLU activation functions, and it uses L2 regularization to reduce overfitting. The model is then compiled with the Adam optimizer and a sparse categorical cross-entropy loss function, making it ready for training. Training runs for up to 30 epochs, with performance monitored on a validation set to check generalization. Early stopping halts training when overfitting is detected.

3.3 Transfer learning with dynamic learning rate

In the proposed system, once the region of interest is identified, it is passed to the next module, which employs a ResNet-based transfer learning model with a dynamically adjusted learning rate for gesture classification. ResNet (Residual Network) architectures are known for their ability to train deep neural networks effectively, even with a large number of layers. By leveraging transfer learning, the system starts from pre-trained ResNet models and fine-tunes them to recognize the hand gestures specific to the application. This approach significantly reduces the training time and computational resources required to achieve high classification accuracy.

One key aspect of the model is its dynamic learning rate adjustment mechanism. The system employs a piecewise learning rate schedule with a drop factor of 0.2, meaning the learning rate is multiplied by 0.2 at each drop. This systematic reduction during training helps the model converge more effectively. The drop period is set to 1 epoch, so the learning rate is updated at the end of each training epoch. The initial learning rate is set to $1 \times 10^{-4}$, providing an appropriate starting point for the training process.

The dynamic adjustment of the learning rate is important for optimizing the training process and improving the model's performance over time. By gradually decreasing the learning rate as training progresses, the system reduces the risk of the model getting stuck in poor local minima and facilitates smoother convergence. This strategy helps the model adapt to the complexities of the gesture recognition task, ultimately leading to more accurate classification results.

Finally, the output of the ResNet-based transfer learning model is used for gesture classification, where gestures are categorized into n classes based on their visual representations. This classification process is essential for enabling the system to accurately interpret and respond to user gestures, thereby facilitating intuitive and seamless interaction within the smart home environment. Through the integration of advanced learning techniques and dynamic learning rate adjustment, the system achieves robust and efficient gesture recognition, enhancing user experience and system performance.

3.4 Class Probability Fusion and Integration with IoT devices

Following the fusion process, the system evaluates the probabilities associated with each class to determine the final predicted gesture. Fusing the predictions allows the system to capture diverse aspects of gestures, thereby enhancing overall accuracy and reliability in classification tasks. By leveraging the strengths of both channels, the system enables seamless and intuitive interaction within smart home environments, enhancing user experience.

The technique behind class probability fusion is mathematically defined as follows:

Let P(i) represent the probability of class i predicted by the attention-based CNN model (Model 1) and P(j) represent the probability of class j predicted by the CNN model with dynamic learning rate (Model 2), where i, j = 1, 2, 3, ..., n and n is the number of classes. The final predicted gesture is determined by selecting the class with the highest probability among the predictions from both models. This is expressed as:

$$\text{Predicted gesture} = \arg\max\big(\max(P(i),\, P(j))\big) \tag{1}$$

Equation (1) computes, for each class, the maximum probability among the predictions from both models and selects the class with the highest maximum probability as the final predicted gesture.

Once the gesture is successfully classified using the trained model on the Raspberry Pi, the resulting class label serves as a command to operate appliances via a relay system. This relay system acts as an intermediary between the Raspberry Pi and the appliances, enabling seamless integration of gesture-based control into the smart home environment. With this setup, a wide range of appliances, including fans, lights, televisions, and air conditioners, can be controlled using hand gestures recognized by the proposed model. Each gesture class corresponds to a specific appliance operation, allowing users to interact intuitively with their smart home ecosystem without the need for physical switches or remote controls. This integration of the gesture recognition system with IoT-based appliance control enhances user experience and facilitates a more interactive and responsive home environment. The hardware experimental setup is shown in Figure 2.
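
The class probability fusion rule of Equation (1) can be illustrated with a minimal Python sketch. It assumes that each model outputs one probability per class, in the same class order. The function name `fuse_predictions` and the example probability vectors are hypothetical and chosen only for illustration; they are not results from the paper.

```python
import numpy as np


def fuse_predictions(p_model1, p_model2):
    """Fuse two class-probability vectors as in Equation (1).

    For each class, keep the larger of the two predicted probabilities,
    then return the index of the class whose kept probability is highest.
    """
    p1 = np.asarray(p_model1, dtype=float)
    p2 = np.asarray(p_model2, dtype=float)
    if p1.shape != p2.shape or p1.ndim != 1:
        raise ValueError("both inputs must be 1-D vectors of equal length")
    fused = np.maximum(p1, p2)            # per-class maximum over both models
    return int(np.argmax(fused)), fused   # class index (0-based) and fused scores


# Illustrative (made-up) outputs for a 3-class problem.
model1_probs = [0.10, 0.60, 0.30]  # assumed output of Model 1
model2_probs = [0.05, 0.15, 0.80]  # assumed output of Model 2
label, fused = fuse_predictions(model1_probs, model2_probs)
print(label, fused)
```

In this example, Model 2 is more confident about the third class than Model 1 is about the second, so the fused prediction is class index 2. Note that the fused scores are no longer guaranteed to sum to one; the rule only uses them for ranking.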

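The piecewise learning rate schedule described in Section 3.3 (initial rate 1 × 10^-4, drop factor 0.2, drop period 1 epoch) can be sketched in a framework-independent way as follows. The function name `piecewise_learning_rate` and the zero-based epoch index are assumptions made for this illustration; the paper does not specify the training framework or the exact code used.

```python
def piecewise_learning_rate(epoch, initial_lr=1e-4, drop_factor=0.2, drop_period=1):
    """Return the learning rate in effect during a given epoch (0-based).

    The rate starts at `initial_lr` and is multiplied by `drop_factor`
    once every `drop_period` completed epochs.
    """
    if epoch < 0 or drop_period < 1:
        raise ValueError("epoch must be >= 0 and drop_period must be >= 1")
    completed_drops = epoch // drop_period
    return initial_lr * (drop_factor ** completed_drops)


# Learning rate for the first few epochs under the stated settings.
for epoch in range(4):
    print(epoch, piecewise_learning_rate(epoch))
```

With these settings the rate shrinks by a factor of five after every epoch, so it falls from 1 × 10^-4 to roughly 8 × 10^-7 by the fourth epoch. This is an aggressive decay, which fits a fine-tuning setup where pre-trained weights need only small adjustments.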
