Page 169 - Kaleidoscope Academic Conference Proceedings 2024

Innovation and Digital Transformation for a Sustainable World




Figure 2 – Sample images from DR class.

Figure 3 – Sample images from Normal class.

Hospital, situated in a rural setting. It is a very popular dataset among researchers for DR detection. Of these images, 2930 were allocated to the training set, while 366 each were designated for the validation and test sets. Within the training set, 1434 images belong to the false (no DR) class and 1496 to the true (DR) class. Similarly, the validation set comprises 172 false and 194 true class images, and the test set contains 199 false and 167 true class images. Sample images representing both class labels are illustrated in Fig. 2 and Fig. 3. All images in the dataset have a resolution of 3216x2136.

3.2 Pre-processing

To meet the input requirements of the CNN networks, all images are resized to a standard size of 224x224x3. Additionally, a preprocessing step subtracts the mean computed across all training images from each image. This normalization centers the data around zero, which aids training convergence and improves performance. Furthermore, the color space is converted to BGR, and all pixel values are normalized to the range [0, 1] or [-1, 1]. This normalization enhances the stability and convergence of the training process.

3.3 Convolutional neural networks (CNN)

Deep learning models designed for image processing leverage CNNs to autonomously acquire hierarchical representations from raw pixel data. These models are usually composed of multiple layers, encompassing convolutional, pooling, and fully connected layers, enabling them to discern and comprehend intricate patterns and features within images. Pre-trained CNN architectures such as VGG, ResNet, and AlexNet serve as popular frameworks that can be fine-tuned to suit specific tasks. These models have demonstrated remarkable performance across various image-related tasks, spanning classification, object detection, and segmentation.

Transfer learning is a strategic method wherein knowledge acquired from solving one task is leveraged to enhance performance on another related task. In practical terms, this often entails utilizing a pre-trained model, which has been trained on a sizable dataset for a specific task, such as image classification. The information ingrained in the pre-trained model's parameters (weights) is then transferred to a new model configured for a different task. In this paper, transfer learning is employed to fine-tune two distinct CNN networks. Since both networks are pre-trained on the ImageNet dataset, customization of the classification layer is necessary to adapt them to our specific application. The original output layer is substituted with a new one tailored for binary classification; consequently, a single output neuron with a sigmoid activation function is employed. Subsequently, the model is compiled with an appropriate loss function. Following this, the model undergoes training, during which the weights are fine-tuned according to our dataset and task. Finally, average accuracy and loss values are computed to assess the model's performance. The hyperparameters employed during training are detailed in Table 1. The following CNN models are used in this research work.

3.3.1 VGG16

VGG16 stands as a prominent CNN architecture [13], pioneered by the Visual Geometry Group (VGG) at the University of Oxford. Its popularity for transfer learning stems from its profound depth and robust feature extraction capabilities. Comprising 16 weight layers, VGG16 includes 13 convolutional layers, 3 fully connected layers, and a softmax layer for classification. The convolutional layers are organized into blocks, with each block housing multiple layers dedicated to feature extraction, followed by a max-pooling layer for spatial dimension reduction and down-sampling. Notably, all convolutional layers share the same filter size and stride. The fully connected layers undertake the task of learning high-level features and generating predictions based on the extracted features. Leveraging pre-trained weights from extensive datasets like ImageNet, this study fine-tunes VGG16 on the APTOS dataset for binary classification, rendering it a potent choice for DR image classification.

3.3.2 ResNet18

ResNet18 emerges as another widely adopted CNN architecture for transfer learning, renowned for its depth and incorporation of skip connections [14]. These skip connections effectively address the vanishing gradient problem. Comprising 18 layers, ResNet18 demonstrates robust performance in image classification tasks. Central to its architecture are residual connections, which enable the learning of residual functions instead of directly mapping inputs to outputs. The ResNet18 architecture features a series of convolutional layers and residual blocks, with each block housing two convolutional layers alongside batch normalization and ReLU activation functions. These residual blocks facilitate the learning of residual representations, which are then merged with the original input via shortcut

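The preprocessing pipeline of Section 3.2 (resize to 224x224x3, RGB-to-BGR conversion, value scaling, and mean subtraction) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the `preprocess` and `train_mean_bgr` names are stand-ins, the nearest-neighbour resize replaces whatever library resampler the pipeline actually uses, and the order of the scaling and centering steps follows one common convention.

```python
import numpy as np

def preprocess(img_rgb, train_mean_bgr):
    """Sketch of the paper's preprocessing for one training image."""
    # Resize to 224x224 with nearest-neighbour sampling (a real
    # pipeline would use a proper library resampler).
    h, w, _ = img_rgb.shape
    rows = np.arange(224) * h // 224
    cols = np.arange(224) * w // 224
    resized = img_rgb[rows][:, cols].astype(np.float64)
    bgr = resized[..., ::-1]          # RGB -> BGR channel order
    scaled = bgr / 255.0              # pixel values into [0, 1]
    return scaled - train_mean_bgr    # zero-centre with the training-set mean
```

In practice `train_mean_bgr` would be the per-channel mean computed once over the 2930 training images, so the centered data keeps the same statistics at validation and test time.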

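Replacing the pre-trained output layer, as described in Section 3.3, amounts to attaching a single sigmoid neuron on top of the network's extracted features and compiling with binary cross-entropy as the loss. A framework-agnostic sketch of that head (the function names are hypothetical; a real implementation would use the deep learning framework's own layers):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_head(features, w, b):
    """Single-neuron head replacing the 1000-way ImageNet output layer.
    `features` are the penultimate-layer activations, one row per image."""
    return sigmoid(features @ w + b)   # probability of the DR class

def bce_loss(p, y):
    """Binary cross-entropy, the appropriate loss for a sigmoid output."""
    eps = 1e-12  # guards against log(0)
    return float(np.mean(-(y * np.log(p + eps)
                           + (1 - y) * np.log(1 - p + eps))))
```

With zero-initialized weights the head outputs 0.5 for every image and the loss starts at ln 2; fine-tuning then adjusts both the head and the pre-trained weights on the APTOS data.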

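The max-pooling layer that closes each VGG16 convolutional block (Section 3.3.1) halves both spatial dimensions. A small sketch shows why five such blocks reduce a 224x224 input to a 7x7 feature map; the `max_pool_2x2` helper is illustrative, not the library operator:

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 over an (H, W, C) array,
    as applied after each VGG16 convolutional block."""
    h, w, c = x.shape
    # Group pixels into 2x2 windows and take the maximum of each.
    return x[:h - h % 2, :w - w % 2, :].reshape(
        h // 2, 2, w // 2, 2, c).max(axis=(1, 3))
```

Applied five times, the spatial path is 224 → 112 → 56 → 28 → 14 → 7, which is the 7x7 grid the fully connected layers then consume.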

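The merge step of a ResNet18 residual block (Section 3.3.2) can be written as y = ReLU(f(x) + x): the block learns only the residual function f, and the skip connection adds the unchanged input back in. In this sketch `f` stands in for the block's two convolution layers with batch normalization, which are omitted for brevity:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, f):
    """Identity-shortcut residual block: the learned residual f(x)
    is merged with the input via the skip connection."""
    return relu(f(x) + x)
```

Because the shortcut passes gradients directly to earlier layers, even an f that initially contributes little leaves the identity mapping intact, which is how these blocks mitigate the vanishing gradient problem.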