Page 55 - Kaleidoscope Academic Conference Proceedings 2024
Innovation and Digital Transformation for a Sustainable World
desired results for diseased potato leaves.

Fig. 1 Workflow diagram

3.1 Dataset

The proposed work was carried out using the Plant Village dataset [12], available on Kaggle. Some examples are shown in Fig. 2. The dataset consists of 3 classes, viz. late_blight, early_blight and healthy. It contains 1000 images for each of the two diseased classes and 152 images of healthy leaves. To address this data imbalance, the healthy category was augmented and increased to 1000 samples. The resulting 3000 samples were used to train the classifiers. Further, for detection, 64 images were taken after augmentation and annotation to train the FRCNN model. All the images were taken in JPG format for convenience, with dimensions of [256 x 256 x 3] pixels. The images are of high resolution, which helps in clear detection and segregation.

Fig. 2 Sample Dataset images

3.2 Data Pre-processing

The dataset was balanced across classes. To achieve this, additional healthy plant images were generated from the existing dataset by altering hue and exposure. For training the backbone classifier, the dataset was split into training and validation sets in a 70:30 ratio. For detection, the images were annotated using roboflow.com. The annotations are labeled with three classes (late_blight, early_blight and healthy) and stored in JSON format to represent the data (COCO dataset format).

3.3.1 4_Layer CNN

A CNN architecture with four convolutional layers was built to serve as the backbone for the Faster R-CNN (FRCNN). The model was trained on three specific classes of the Plant Village dataset. Early outcomes showed encouraging possibilities for additional improvement. Training took place over ten epochs using the Adam optimizer with the learning rate set to 1e-3, with categorical cross-entropy as the loss function. To maximize feature extraction and model performance, each convolutional layer was followed in turn by batch normalization and max-pooling. The reported validation accuracy was 73.21%. The 4-layer CNN architecture is shown in Fig. 3.

Fig. 3 4-layer CNN

3.3.2 5_Layer CNN

The model architecture included five convolutional layers and a convolutional transpose (conv transpose) layer to improve feature extraction and feature-map adjustment. The Adam optimizer was used in conjunction with a categorical cross-entropy loss function. The conv transpose layer was critical in constraining and refining the feature maps, allowing efficient information flow and spatial resolution adjustment throughout the network. The model was trained for 10 epochs with a learning rate of 1e-3. This configuration produced 97% accuracy on the validation dataset. These findings highlight the importance of hyper-parameter selection and of the architectural components used in achieving high accuracy in image analysis tasks. The 5-layer custom CNN architecture is shown in Fig. 4.
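The layer stacks described in Sections 3.3.1 and 3.3.2 can be illustrated by tracing how the spatial size of a 256 x 256 input changes through the network. The following is only a minimal sketch under stated assumptions: the kernel sizes, strides and padding (3 x 3 convolutions with padding 1, 2 x 2 max-pooling with stride 2, a 2 x 2 transposed convolution with stride 2) and the placement of the conv transpose layer at the end are hypothetical, since these details are not given in the text. The helper names are likewise illustrative.

```python
# Standard output-size formulas for one spatial dimension.
# All kernel/stride/padding defaults below are assumptions, not values from the paper.

def conv_out(n, kernel=3, stride=1, padding=1):
    """Output size of a convolution."""
    return (n + 2 * padding - kernel) // stride + 1


def pool_out(n, kernel=2, stride=2):
    """Output size of max-pooling (no padding)."""
    return (n - kernel) // stride + 1


def conv_transpose_out(n, kernel=2, stride=2, padding=0, output_padding=0):
    """Output size of a transposed convolution."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


def trace_sizes(input_size, num_conv_blocks, upsample_at_end=False):
    """Return the feature-map size after each conv -> batch norm -> max-pool block."""
    sizes = [input_size]
    n = input_size
    for _ in range(num_conv_blocks):
        n = conv_out(n)   # 3x3 conv with padding 1 keeps the size
        # batch normalization does not change the spatial size
        n = pool_out(n)   # 2x2 max-pool with stride 2 halves the size
        sizes.append(n)
    if upsample_at_end:
        n = conv_transpose_out(n)  # hypothetical placement of the conv transpose layer
        sizes.append(n)
    return sizes


four_layer = trace_sizes(256, 4)
five_layer = trace_sizes(256, 5, upsample_at_end=True)
print(four_layer)
print(five_layer)
```

Under these assumptions, the transposed convolution doubles the spatial size of the last feature map, which is one way a conv transpose layer can adjust spatial resolution inside a backbone.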
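The COCO-format JSON mentioned in Section 3.2 can be illustrated with a small example. This is a sketch of the general COCO detection layout (images, annotations, categories), not the authors' actual annotation file: the file names, bounding boxes and category ID numbering are hypothetical, and the helper `make_coco` is introduced here only for illustration. The three class names and the 256 x 256 image size are taken from the text.

```python
import json

# Class names as given in the paper; the ID numbering (1..3) is an assumption.
CLASS_NAMES = ["late_blight", "early_blight", "healthy"]


def make_coco(records, width=256, height=256):
    """Build a COCO-style dict from (file_name, class_name, [x, y, w, h]) records."""
    categories = [{"id": i + 1, "name": name} for i, name in enumerate(CLASS_NAMES)]
    name_to_id = {c["name"]: c["id"] for c in categories}
    images, annotations, image_ids = [], [], {}
    for file_name, class_name, bbox in records:
        if file_name not in image_ids:
            image_ids[file_name] = len(image_ids) + 1
            images.append({"id": image_ids[file_name], "file_name": file_name,
                           "width": width, "height": height})
        x, y, w, h = bbox  # COCO boxes are [x_min, y_min, width, height]
        annotations.append({
            "id": len(annotations) + 1,
            "image_id": image_ids[file_name],
            "category_id": name_to_id[class_name],
            "bbox": [x, y, w, h],
            "area": w * h,
            "iscrowd": 0,
        })
    return {"images": images, "annotations": annotations, "categories": categories}


# Hypothetical records, for illustration only.
example = make_coco([
    ("leaf_001.jpg", "late_blight", [40, 60, 120, 90]),
    ("leaf_001.jpg", "late_blight", [10, 10, 30, 30]),
    ("leaf_002.jpg", "healthy", [0, 0, 256, 256]),
])
coco_json = json.dumps(example, indent=2)
print(coco_json)
```

Each annotation refers to its image and class by ID, so a single image can carry several boxes, as with the two late_blight regions in the hypothetical `leaf_001.jpg` above.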