Page 56 - Kaleidoscope Academic Conference Proceedings 2024

2024 ITU Kaleidoscope Academic Conference

Fig.4 5-layer CNN

3.3.3 MobileNet

The MobileNet architecture, shown in Fig. 5, was used to process input images of 224x224x3, using the Adam optimizer with a learning rate of 1e-3 over a training period of 10 epochs. The model was trained with a cross-entropy loss function and achieved an accuracy of 78.43%. MobileNet is based on depth-wise separable convolutions, a design technique that reduces computational complexity and model size while maintaining performance. The architecture is made up of depth-wise convolutions followed by point-wise (1x1) convolution layers, which efficiently capture spatial and channel-wise dependencies in the input data.

Fig.5 Architecture of MobileNet

3.3.4 DETECTION

The framework's final stage uses Faster R-CNN (F-RCNN), shown in Fig. 6, which has a distinctive two-network architecture, a Region Proposal Network (RPN) and an object detection network, that replaces traditional selective search methods. F-RCNN consists of three major components. First, convolutional layers extract critical features by applying a variety of filters to the microscopic images, supplemented by pooling and fully connected layers for accurate classification. Second, the RPN component determines the presence of objects and predicts precise bounding boxes. Third, the fully connected neural network processes the features to predict object classes. Among the proposed architectures, the 5-layer CNN produced the best results, demonstrating superior accuracy and feature extraction capabilities in the detection framework. This demonstrates the effectiveness of specific design choices in the 5-layer CNN model for comprehensive object detection.

Fig.6 Architecture of F-RCNN

A wide range of pre-trained and custom-designed models was tested extensively in this study to determine their effectiveness. The 5-layer convolutional neural network (CNN) performed better than the other models on all criteria, showing consistent superiority. The major contributions of the 5-layer custom CNN model are listed below:
• The model was methodically fine-tuned with specific hyperparameters: a cross-entropy loss function combined with the Adam optimizer, trained across 10 epochs with a calibrated learning rate of 1e-3.
• One noteworthy aspect of this CNN architecture was that the channel count was doubled in each layer until it reached a maximum of 128 channels. This deliberate increase in channel capacity made it easier to create rich, expressive feature maps, which allowed the model to identify subtle patterns in the data and eventually led to its exceptional performance in the experimental context.
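The channel-doubling pattern of the 5-layer CNN can be illustrated with a small sketch. The text states only that the channel count doubles in each layer up to a maximum of 128; the starting width of 8 channels, the 3x3 kernel size, the RGB input, and the helper names below are assumptions made for illustration, not details reported in the paper.

```python
def channel_schedule(base_channels, num_layers, max_channels=128):
    """Return per-layer output channel counts that double each layer,
    capped at max_channels."""
    schedule = []
    channels = base_channels
    for _ in range(num_layers):
        schedule.append(min(channels, max_channels))
        channels *= 2
    return schedule


def conv_weight_count(in_channels, out_channels, kernel_size=3):
    """Number of weights in a standard k x k convolution (biases excluded)."""
    return kernel_size * kernel_size * in_channels * out_channels


# Assumed starting width of 8, so a five-layer stack ends exactly at the
# 128-channel cap mentioned in the text.
widths = channel_schedule(base_channels=8, num_layers=5)
print(widths)

# Assumed RGB input (3 channels) feeds the first layer; each layer's output
# width becomes the next layer's input width.
in_ch = 3
for out_ch in widths:
    print(in_ch, "->", out_ch, "weights:", conv_weight_count(in_ch, out_ch))
    in_ch = out_ch
```

Under these assumptions, most of the weights sit in the last layers, which is where the wider feature maps are produced.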

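The saving offered by the depth-wise separable convolutions described in Section 3.3.3 can be made concrete by counting weights. This is a minimal sketch: the function names and the example layer size (3x3 kernel, 64 input channels, 128 output channels) are illustrative assumptions, not values taken from the paper.

```python
def standard_conv_weights(in_channels, out_channels, kernel_size):
    """Weights in a standard convolution: every output channel has a
    k x k filter over every input channel (biases excluded)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_weights(in_channels, out_channels, kernel_size):
    """Weights in a depth-wise convolution followed by a point-wise (1x1)
    convolution (biases excluded)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1x1 filters that mix channels
    return depthwise + pointwise


# Hypothetical layer size, chosen only for illustration.
std = standard_conv_weights(64, 128, 3)
sep = depthwise_separable_weights(64, 128, 3)
print(std, sep, round(std / sep, 2))  # 73728 8768 8.41
```

For this layer the separable form needs roughly one eighth of the weights. In general the ratio of separable to standard weights is 1/C_out + 1/k^2, so the saving grows with the number of output channels and the kernel size.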