Page 457 - Kaleidoscope Academic Conference Proceedings 2024

Innovation and Digital Transformation for a Sustainable World



Fig. 2. Proposed model

Image Acquisition: Any camera, even a laptop webcam, can be used to acquire the images, but this study uses hand gesture images that are publicly available on Kaggle.

Pre-processing: The most important challenge during the experiments is finding a suitable dataset. The dataset consists of hand gesture images captured under different conditions, such as varied lighting, backgrounds, and dimensions. To enable real-time classification, pre-processing converts each RGB image to a grayscale image, as shown in Fig. 3. Because a grayscale image has one channel instead of three, this step reduces the number of parameters in the first convolutional layer and lowers the computational requirements.

Fig. 3. Image pre-processing

Fig. 4. Sampling of dataset

Data Augmentation: Deep learning neural networks frequently perform better when more data is available. Data augmentation is a method for generating new training data from existing training data. To do this, domain-specific techniques are applied to instances from the training data to produce new and distinct training examples. The best-known form, image data augmentation, transforms images from the training dataset into new copies that belong to the same class as the original image. The transforms include rotations, flips, zooms, and other image alteration techniques.

DATASET:
The proposed model is evaluated on the Sign Language MNIST dataset, which can be obtained from https://www.kaggle.com/datasets/datamunge/sign-language-mnist. This dataset is popular and useful for gesture recognition. It has 24,000 pictures and 24 different gestures. The dataset consists of two .csv files, sign_mnist_train.csv and sign_mnist_test.csv, which are used for training and testing the model, respectively. The training set consists of 27,455 cases (about 80%), and the test set has 7,172 cases (about 20%), for a total of 34,627 cases. Each case has 784 pixel features (28 × 28). The label, stored in the first column of the dataset, is the target. It has 24 distinct values, so this is a multiclass problem. The dataset is imbalanced: there are 24 classes, but not every class has the same number of samples. According to Fig. 4, class No. 17 has the highest sample count and class No. 4 has the lowest.

METHODOLOGY:
In our study, each image in the dataset has a width and height of 28 pixels. The dataset was developed to be useful to people with hearing difficulties. Some of the letters in the dataset are shown in Fig. 4, and Fig. 5 shows the typical architecture of a CNN. The proposed method takes a 28 × 28 image as input. Applying this CNN involves the following steps: first, a picture is input (read as an array of pixels); second, pre-processing, shown in Fig. 3, and filtering are performed; and finally, the results are obtained after classification, as shown in Fig. 5. The proposed CNN framework is designed to obtain the best results for static human hand gesture recognition. The framework architecture is shown in Fig. 5. Convolutional neural networks take input images and convolve them with filters, or kernels, to extract features. In this paper, we consider an N × N image convolved with an f × f filter, and this convolution operation learns the same feature on
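The pre-processing claim on this page, that converting RGB to grayscale reduces the parameters of the first convolutional layer, can be checked with a short sketch. The luminance weights (0.299, 0.587, 0.114) are a common convention and are an assumption here, as are the 3 × 3 kernel and the 32 filters; the paper does not state these values on this page.

```python
def rgb_to_gray(image):
    """Convert an H x W grid of (r, g, b) tuples to a grid of gray levels.

    The weights are a common luminance convention (assumed, not from the paper).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]


def conv_params(kernel_size, in_channels, num_filters):
    """Trainable parameters of one conv layer: weights plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * num_filters


# Tiny hypothetical 2 x 2 RGB image: white, black, red, blue.
rgb_image = [[(255, 255, 255), (0, 0, 0)],
             [(255, 0, 0), (0, 0, 255)]]
gray_image = rgb_to_gray(rgb_image)

# Hypothetical first layer: 3 x 3 kernels, 32 filters.
params_rgb = conv_params(kernel_size=3, in_channels=3, num_filters=32)
params_gray = conv_params(kernel_size=3, in_channels=1, num_filters=32)
print(params_rgb, params_gray)  # 896 320
```

Under these assumed settings the first layer shrinks from 896 to 320 parameters, because each filter spans one input channel instead of three.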
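The Data Augmentation step on this page can be illustrated with a minimal sketch that uses two of the transforms named in the text, a flip and a rotation. It works on plain nested lists and restricts rotation to 90 degrees to avoid interpolation; both choices, and the function names, are simplifications for illustration rather than the authors' pipeline. Whether a mirrored or rotated hand sign still belongs to the same class is also an assumption that should be checked for each gesture.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def augment(image, label):
    """Return the original image plus transformed copies, all with the same label."""
    variants = [image, flip_horizontal(image), rotate_90_clockwise(image)]
    return [(variant, label) for variant in variants]


# Tiny hypothetical 2 x 2 "image" so the effect of each transform is visible.
sample = [[1, 2],
          [3, 4]]
augmented = augment(sample, label=5)
for variant, label in augmented:
    print(label, variant)
```

Each original example yields three training examples here; zooms and small-angle rotations would follow the same pattern but need an image library for resampling.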
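The file layout described in the DATASET section (label in the first column, followed by 784 pixel values) can be read and checked for class imbalance roughly as follows. This is a sketch under stated assumptions: the header names `label` and `pixel1` to `pixel784` are assumed, `load_sign_mnist` is a hypothetical helper, and the three rows are synthetic stand-ins for sign_mnist_train.csv rather than real data.

```python
import csv
import io
from collections import Counter


def load_sign_mnist(fileobj, side=28):
    """Parse rows of 'label, pixel values...' into labels and side x side images."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the header row
    labels, images = [], []
    for row in reader:
        label = int(row[0])
        pixels = [int(value) for value in row[1:]]
        if len(pixels) != side * side:
            raise ValueError(f"expected {side * side} pixels, got {len(pixels)}")
        labels.append(label)
        # Reshape the flat pixel list into rows of the image.
        images.append([pixels[r * side:(r + 1) * side] for r in range(side)])
    return labels, images


# Synthetic stand-in for the real file: three all-black images, two classes.
header = "label," + ",".join(f"pixel{i}" for i in range(1, 785))
rows = [f"{label}," + ",".join(["0"] * 784) for label in (3, 3, 17)]
sample_csv = io.StringIO("\n".join([header] + rows))

labels, images = load_sign_mnist(sample_csv)
counts = Counter(labels)  # samples per class, useful for spotting imbalance
print(counts.most_common())
```

With the real training file, passing an open file handle instead of `sample_csv` would give the per-class counts that Fig. 4 summarizes.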





