2024 ITU Kaleidoscope Academic Conference

           necessitating improvements in model interpretability,
           trustworthiness, and cost-effectiveness. The transformer
           with multiple instance learning (TMIL) model of Yang et
           al. [2] segments images into 224×224 patches for feature
           extraction. This approach preserves valuable information
           and leverages pre-trained weights without performance
           loss. Experimental results show TMIL outperforms existing
           methods, improving classification accuracy and reducing
           inference time by 62% on the APTOS and Messidor-1
           datasets. Lahmar et al. [3] conducted a comparative study of
           seven pre-trained CNN models for the binary classification
           of DR. Evaluating these models on various parameters using
           DR datasets, they found that MobileNetV2 achieved the
           highest accuracy, scoring 93.09% on the APTOS dataset.
           Lahmar et al. [4] conducted an extensive study assessing
           the performance of 28 hybrid deep learning architectures and
           7 standalone deep learning models for binary classification
           of DR. Their comprehensive evaluation aimed to identify
           the most effective methods for distinguishing between the
           presence and absence of DR, providing valuable insights into
           automated DR detection.
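The patch-based preprocessing used by TMIL-style pipelines can be illustrated with a short sketch. The function below (a minimal illustration, not the authors' implementation) tiles a fundus image into non-overlapping 224×224 patches, the "bag" that a multiple-instance model would then embed; the image shape and function name are hypothetical.

```python
import numpy as np

def extract_patches(image, patch=224):
    """Split an H x W x C image into non-overlapping patch x patch tiles.

    The image is first cropped to the largest multiple of `patch` in each
    dimension; each tile can then be embedded independently, as in
    multiple-instance learning pipelines.
    """
    h, w, c = image.shape
    ph, pw = h // patch, w // patch
    img = image[: ph * patch, : pw * patch]
    tiles = (
        img.reshape(ph, patch, pw, patch, c)
           .transpose(0, 2, 1, 3, 4)
           .reshape(ph * pw, patch, patch, c)
    )
    return tiles

# A hypothetical 448x672 RGB image yields a bag of 2*3 = 6 patches.
bag = extract_patches(np.zeros((448, 672, 3), dtype=np.uint8))
print(bag.shape)  # (6, 224, 224, 3)
```

The reshape-then-transpose idiom avoids explicit loops, so the tiling cost is negligible next to the network's forward pass.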
           Kassani et al.  [5] introduced a novel feature extraction
           technique for DR diagnosis using a customized Xception
           architecture.  Leveraging the deep layer accumulation
           characteristic of Xception, this approach efficiently extracts
           intricate features from retinal images. The extracted features
           are subsequently fed into a multi-layer perceptron, offering
           a robust classification method. Evaluation on the APTOS
           dataset demonstrated an accuracy of 83.09%, highlighting its
           promise for dependable DR detection in clinical settings. In
           [6], a machine learning-based method is proposed for early
           DR detection employing the Inception V3 model. Trained and
           tested on the EyePACS and APTOS 2019 datasets, their model
           attained an accuracy of 81.61% and an F1 score of 80.21%
           on the APTOS 2019 dataset, demonstrating its effectiveness
           in DR detection.
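The common pattern in [5] and [6], where CNN features are handed to a small classifier head, can be sketched as follows. This is a toy forward pass with randomly initialized weights, not the cited models: the feature dimension (2048), hidden width (128), and binary output are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(features, w1, b1, w2, b2):
    """One-hidden-layer perceptron head: ReLU hidden layer, softmax output."""
    h = np.maximum(features @ w1 + b1, 0.0)           # ReLU activation
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)           # class probabilities

# Hypothetical sizes: 2048-d CNN features, 128 hidden units, 2 classes (DR / no DR).
d, hdim, k = 2048, 128, 2
w1, b1 = rng.normal(0, 0.01, (d, hdim)), np.zeros(hdim)
w2, b2 = rng.normal(0, 0.01, (hdim, k)), np.zeros(k)

feats = rng.normal(size=(4, d))                       # feature vectors for 4 images
probs = mlp_forward(feats, w1, b1, w2, b2)
print(probs.shape)  # (4, 2); each row sums to 1
```

Decoupling the frozen feature extractor from a lightweight trainable head is what makes these transfer-learning approaches cheap to train on small retinal datasets.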
           In [7], intermediate layers of the DenseNet-121 model are
           utilized for feature extraction. In [8], a CNN-based model
           is introduced for detecting and categorizing DR. Utilizing
           the APTOS dataset, the model attained accuracies of
           97% and 93% with CNN and AlexNet models, respectively,
           following training and validation on distinct datasets. Farag
           et al. [9] proposed a novel method utilizing DenseNet169
           to automatically assess the severity of DR. They leveraged
           DenseNet169's encoder to generate visual embeddings and
           integrated an attention module to improve discrimination
           capability. Dhir et al. [10] modified neural networks and
           evaluated their performance on the APTOS dataset. Their
           study evaluated five deep learning architectures using 1228
           fundus images from six datasets, including DR, Glaucoma,
           and Cataract. EfficientNetB0 outperformed the other models,
           significantly surpassing previous studies like Fast-RCNN and
           InceptionResNet [11].

                         3.  METHODOLOGY

           This study employs straightforward Convolutional Neural
           Network (CNN) models, VGG16 and ResNet18, for transfer
           learning, due to their optimal number of learning layers,
           which accelerates training speed. The schematic representation
           of our proposed method is illustrated in Fig. 1. Initially,
           the dataset is loaded and divided into training, testing, and
           validation sets. The pre-trained VGG16 network is then
           loaded and its input and classification layers are adapted to
           suit the task. Subsequently, appropriate hyper-parameters
           are chosen, and the network undergoes training. Following
           transfer learning, features are extracted from the fully
           connected layer 7 (FC7). To optimize feature selection, a KW
           test is applied, and machine learning classifiers are employed
           for the classification task using the selected features. Detailed
           explanations of each step are provided below:

           Figure 1 – Block diagram of the proposed method.

           3.1 Dataset

           This study utilized the APTOS dataset [12] for binary
           classification purposes. This dataset comprises retinal images
           captured by a fundus camera operated by Aravind Eye