Page 56 - Kaleidoscope Academic Conference Proceedings 2024

2024 ITU Kaleidoscope Academic Conference




Fig. 4 5-layer CNN

3.3.3 Mobile Net

The Mobile Net architecture shown in Fig. 5 was used to process input images of 224x224x3, using an Adam optimizer with a learning rate of 1e-3 over a training period of 10 epochs. The model was trained with a cross-entropy loss function, yielding an accuracy of 78.43%. Mobile Net is based on depth-wise separable convolutions, a design technique for reducing computational complexity and model size while maintaining performance. The architecture is made up of depth-wise convolutions followed by point-wise (1x1) convolution layers that efficiently capture spatial and channel-wise dependencies in the input data.

Fig. 5 Architecture of Mobile Net

3.3.4 Detection

The framework's final stage makes use of Faster R-CNN (F-RCNN), shown in Fig. 6, which uses a two-network architecture, a Region Proposal Network (RPN) and an object detection network, in place of traditional selective search methods. F-RCNN consists of three major components. To begin, convolutional layers enable critical feature extraction by employing a variety of filters for microscopic image analysis, supplemented by pooling and fully connected layers for accurate classification. The RPN component determines the presence of objects and predicts precise bounding boxes, while the fully connected neural network processes features to predict object classes.

Fig. 6 Architecture of F-RCNN

Among the proposed architectures, the 5-layer CNN produced the best results, demonstrating superior accuracy and feature extraction capabilities in the detection framework. This demonstrates the effectiveness of specific design choices in the 5-layer CNN model for comprehensive object detection.

A wide range of pre-trained and custom-designed models were tested extensively in this study to determine their effectiveness. The 5-layer convolutional neural network (CNN) consistently outperformed the other models on all criteria. The major contributions of the 5-layer custom CNN model are listed below:
    •   The hyperparameters used to fine-tune the model were a cross-entropy loss function combined with the Adam optimizer, a learning rate of 1e-3, and training across 10 epochs.
    •   One noteworthy aspect of this CNN architecture was that the channel count doubled in each layer until it reached a maximum of 128 channels. This increase in channel capacity made it easier to create rich, expressive feature maps, which allowed the model to identify subtle patterns in the data and led to its strong performance in the experiments.
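
The channel-doubling scheme and training settings reported for the 5-layer CNN can be illustrated with a small sketch. It assumes five convolutional layers with 3x3 kernels, a 3-channel input, and a first layer of 8 channels, so that doubling reaches 128 at the fifth layer. None of these values is stated in the text, and the function names are hypothetical. Only the hyperparameters in TRAINING_CONFIG are taken from the paper.

```python
# Hyperparameters reported in the text for the 5-layer CNN.
TRAINING_CONFIG = {
    "optimizer": "adam",
    "learning_rate": 1e-3,
    "epochs": 10,
    "loss": "cross_entropy",
}


def channel_schedule(first_channels=8, num_layers=5, max_channels=128):
    """Output channels per layer: double each layer, cap at max_channels.

    first_channels=8 is an assumption chosen so the fifth layer hits 128.
    """
    channels = []
    width = first_channels
    for _ in range(num_layers):
        channels.append(min(width, max_channels))
        width *= 2
    return channels


def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of one standard 2-D convolution (kernel size assumed)."""
    return kernel * kernel * in_ch * out_ch + out_ch


schedule = channel_schedule()
# Assumed 3-channel (RGB) input feeding the first layer.
in_channels = [3] + schedule[:-1]
params_per_layer = [conv_params(i, o) for i, o in zip(in_channels, schedule)]

for layer, (i, o, p) in enumerate(zip(in_channels, schedule, params_per_layer), 1):
    print(f"conv{layer}: {i:>3} -> {o:>3} channels, {p} parameters")
print("total convolutional parameters:", sum(params_per_layer))
```

Under these assumptions the schedule is 8, 16, 32, 64, 128, and most of the convolutional parameters sit in the last two layers, which is where the wider feature maps mentioned above are produced.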





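
The saving from the depth-wise separable convolutions described in Section 3.3.3 can be illustrated by counting weights. The sketch below compares a standard convolution with a depth-wise convolution followed by a point-wise (1x1) convolution. The layer size (64 input channels, 128 output channels) and the 3x3 kernel are illustrative assumptions, not values from the paper, and the function names are hypothetical. Biases are ignored for simplicity.

```python
def standard_conv_weights(in_ch, out_ch, kernel=3):
    """Weights in a standard convolution: every filter spans all input channels."""
    return kernel * kernel * in_ch * out_ch


def separable_conv_weights(in_ch, out_ch, kernel=3):
    """Weights in a depth-wise separable convolution.

    The depth-wise step applies one kernel x kernel filter per input channel.
    The point-wise step mixes channels with 1x1 filters.
    """
    depthwise = kernel * kernel * in_ch
    pointwise = in_ch * out_ch
    return depthwise + pointwise


# Illustrative layer size (an assumption, not taken from the paper).
in_ch, out_ch, kernel = 64, 128, 3

std = standard_conv_weights(in_ch, out_ch, kernel)
sep = separable_conv_weights(in_ch, out_ch, kernel)
ratio = sep / std  # equals 1/out_ch + 1/kernel**2

print(f"standard:  {std} weights")
print(f"separable: {sep} weights")
print(f"ratio:     {ratio:.3f}")
```

For this assumed layer the separable form needs roughly 12% of the weights of the standard convolution, which is the kind of reduction in model size and computation that the Mobile Net design targets.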