Page 55 - Kaleidoscope Academic Conference Proceedings 2024

Innovation and Digital Transformation for a Sustainable World




desired results for diseased potato leaves.

Fig. 1 Workflow diagram

3.1 Dataset

The proposed work was carried out using the PlantVillage dataset [12], available on Kaggle. Some examples are shown in Fig. 2. The dataset consists of three classes, viz. late_blight, early_blight and healthy. It contains 1000 images for each of the two diseased classes and 152 images of healthy leaves. To address this data imbalance, the healthy category was augmented to 1000 samples. The resulting 3000 samples were used to train the classifiers. Further, for detection, 64 images were taken after augmentation and annotation to train the FRCNN model. All images were in JPG format for convenience, with dimensions of [256 x 256 x 3] pixels. The images are of high resolution, which helps in clear detection and segregation.

Fig. 2 Sample dataset images

3.2 Data Pre-processing

The dataset was balanced across the classes. To do so, healthy plant images were generated from the existing dataset by altering hue and exposure. For training the backbone classifier, the dataset was split into training and validation sets in the ratio 70:30. For detection, the images were annotated using roboflow.com. The images were annotated and labeled into three classes (late_blight, early_blight and healthy) and exported in JSON format to represent the data (COCO dataset format).

3.3.1 4_Layer CNN

A CNN architecture with four convolutional layers was built to serve as the backbone for the Faster R-CNN (FRCNN). The model was trained on the three classes of the PlantVillage dataset. Early outcomes showed encouraging possibilities for additional improvement. Categorical cross-entropy was used as the loss function during training, which took place over ten epochs using the Adam optimizer with the learning rate set to 1e-3. To maximize feature extraction and model performance, each convolutional layer was followed by batch normalization and max-pooling. The reported validation accuracy was 73.21%. The 4-layer CNN architecture is shown in Fig. 3.

Fig. 3 4-layer CNN

3.3.2 5_Layer CNN

The model architecture included five convolutional layers and a convolutional transpose (conv transpose) layer to improve feature extraction and feature-map adjustment. During training, the Adam optimizer was used in conjunction with a categorical cross-entropy loss function. The conv transpose layer was critical in constraining and refining the feature maps, allowing for efficient information flow and spatial resolution adjustment throughout the network. The model was trained for 10 epochs with a learning rate of 1e-3. This configuration produced 97% accuracy on the validation dataset. These findings highlight the importance of hyper-parameter selection and of the added architectural components in achieving high accuracy in image analysis tasks. The 5-layer custom CNN architecture is shown in Fig. 4.

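The hue and exposure augmentation described in Section 3.2 can be illustrated on a single RGB pixel. This is a minimal sketch, not the authors' pipeline: the helper name `augment_pixel` and the default shift and gain values are assumptions chosen for illustration. The last lines compute how many synthetic healthy images are needed to go from 152 to 1000.

```python
import colorsys

def augment_pixel(rgb, hue_shift=0.05, exposure_gain=1.2):
    """Shift hue and scale brightness of one 8-bit RGB pixel.

    hue_shift is a fraction of the full hue circle; exposure_gain
    multiplies the HSV value channel. Both defaults are illustrative.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_shift) % 1.0           # wrap around the hue circle
    v = min(1.0, v * exposure_gain)     # clamp so the pixel stays valid
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(c * 255) for c in (r2, g2, b2))

# Synthetic healthy images needed to balance the classes.
HEALTHY_ORIGINAL = 152
TARGET_PER_CLASS = 1000
n_augmented_needed = TARGET_PER_CLASS - HEALTHY_ORIGINAL
```

Applying the same transform to every pixel of an image, with varied shift and gain values, yields new training samples that keep the leaf shape but change its colour and brightness.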

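The 70:30 split of the 3000 balanced samples (Section 3.2) can be sketched as a per-class split. The paper does not say whether the split was stratified; stratification, the random seed, the placeholder file names and the helper `split_per_class` are all assumptions.

```python
import random

def split_per_class(samples_by_class, train_fraction=0.7, seed=0):
    """Shuffle each class separately, then split it into train/validation."""
    rng = random.Random(seed)
    train, val = [], []
    for label, items in samples_by_class.items():
        items = list(items)
        rng.shuffle(items)
        cut = round(len(items) * train_fraction)
        train += [(x, label) for x in items[:cut]]
        val += [(x, label) for x in items[cut:]]
    return train, val

# Placeholder file names stand in for the 1000 images per class.
classes = ["late_blight", "early_blight", "healthy"]
data = {c: [f"{c}_{i}.jpg" for i in range(1000)] for c in classes}
train_set, val_set = split_per_class(data)
```

With 1000 images per class this gives 2100 training and 900 validation samples, with each class equally represented in both sets.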

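For reference, a COCO-format annotation file of the kind mentioned in Section 3.2 has three top-level lists: `images`, `annotations` and `categories`. The sketch below builds one such record using the three class names from the paper. The file name, ids and bounding-box values are made-up placeholders, not data from the study.

```python
import json

coco = {
    "images": [
        {"id": 1, "file_name": "leaf_0001.jpg", "width": 256, "height": 256}
    ],
    "categories": [
        {"id": 1, "name": "late_blight"},
        {"id": 2, "name": "early_blight"},
        {"id": 3, "name": "healthy"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 2,
            # COCO boxes are [x_min, y_min, width, height] in pixels.
            "bbox": [40, 60, 120, 100],
            "area": 120 * 100,
            "iscrowd": 0,
        }
    ],
}

coco_json = json.dumps(coco, indent=2)
```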

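The effect of stacking four convolution, batch-normalization and max-pooling stages (Section 3.3.1) on a 256 x 256 input can be traced with simple arithmetic. The paper does not give kernel sizes, padding or filter counts, so the stride-1 "same" convolutions, 2 x 2 pooling and the filter counts below are assumptions for illustration only.

```python
def conv_same(size):
    """A stride-1 convolution with 'same' padding keeps the spatial size."""
    return size

def max_pool(size, pool=2):
    """Non-overlapping max pooling divides the spatial size by `pool`."""
    return size // pool

def trace_shapes(input_size=256, filters=(32, 64, 128, 256)):
    """Return (height, width, channels) after each conv + BN + pool stage."""
    shapes = []
    size = input_size
    for n_filters in filters:
        # Batch normalization does not change the tensor shape.
        size = max_pool(conv_same(size))
        shapes.append((size, size, n_filters))
    return shapes

shapes = trace_shapes()
```

Under these assumptions the spatial size halves at every stage, from 256 down to 16 after the fourth pooling layer.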

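The spatial resolution adjustment attributed to the conv transpose layer in Section 3.3.2 follows from the standard output-size relation for a transposed convolution with dilation 1: out = (in - 1) * stride - 2 * padding + kernel + output_padding. The kernel, stride and padding values in the example are assumptions; the paper does not specify them.

```python
def conv_transpose_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Output length of a transposed convolution along one axis (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# Example: a 16 x 16 feature map passed through a 2 x 2 kernel with stride 2.
upsampled = conv_transpose_out(16, kernel=2, stride=2)
```

With a stride of 2 the layer doubles the feature-map size, which is one way such a layer can restore resolution lost to pooling.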