
2024 ITU Kaleidoscope Academic Conference




to accomplish this classification: 1) Machine Learning Approach; and 2) Lexicon Based Approach.

The rest of the manuscript is structured as follows. Section II outlines SA methods. Section III contains the literature review. Section IV offers a summary of the framework, followed by a comprehensive report in Section V, which also includes suggestions for future research.

2.  METHODOLOGY

Sentiment analysis is an automatic method of determining whether a text expresses a positive, negative, or neutral opinion of its subject matter. There are three levels of sentiment classification: document level, sentence level, and aspect (feature) level [7].

• Document Level: At the document level, all content is classified as positive or negative, based on easily navigable data categories.
• Sentence Level: The sentiment classification further categorizes sentences as neutral, positive, or negative. This allows sentences to be classified as either subjective or objective at the sentence level.
• Aspect Level: This kind of sentiment categorization deals with identifying and extracting item attributes from the original content.

Figure 1 – Classification Level of Sentiment Analysis

NLP is the study of Computer-Assisted Speech and Language Processing (CASP) and its application. NLP is extensively utilized in automated inquiry, machine interpretation, and text mining. Previous studies show that NLP uses a wide range of techniques in different domains and that the reported results are above 80%, which can be said to be quite impressive in comparison to Machine Learning [8]. For SA, the following four classification approaches are applied: 1) Machine Learning Method 2) Lexicon-Based Method 3) Hybrid Method 4) An alternative method.

2.1   Machine Learning Approach

There are two main categories of machine learning models: 1) Supervised Learning 2) Unsupervised Learning. Supervised learning is the most widely applied machine learning technique [8]. With this technique, a model is trained on labeled source data. When fresh unlabeled input data is obtained, the trained model can predict an output. Supervised learning is typically more effective than unsupervised or semi-supervised learning techniques. However, depending only on labeled training data might be time-consuming and ineffective [9]. Random Forests, Support Vector Machines, Logistic Regression, Naive Bayes, and Decision Trees are among the classifiers applied in sentiment analysis [10].

Machine learning permits computers to pick up new abilities without requiring additional programming. Popular techniques include Decision Trees (DT), Logistic Regression (LR), K-nearest Neighbors (KNN), Support Vector Machines (SVM), Naïve Bayes (NB), Maximum Entropy (ME), semi-supervised and super-supervised learning [11]. The authors of [12] conclude from their data that Logistic Regression outperforms the other classifiers in all cases, that SVM- and LR-based classifiers perform well, and that DT-based classifiers provide superior accuracy [12]. Specifically, several academics have examined CNN and its various CNN+LSTM combinations. Their performance was contrasted with that of modern models, including CNN/LSTM, KNN/NN, etc. With an accuracy rate of over 97%, this model outperforms all other models. The CNN LSTM results (F-Measure-88, Mean-91, and Probability-96.32%) are reported to fare better than the other models (F-Measure-92, Mean-93, and Probability-97.8%). Their evaluation results demonstrate the superiority of the approach over all deep learning models, with BRCAN scoring the highest at 96.32% [13].

2.2   Lexicon Based Approach

The lexicon-based strategy isolates the words of a specific text and scores them against an available lexicon. Usually, the word scores are added together to achieve this. Essentially, it is split into two sections: a) dictionary-based b) corpus-based [11]. Words that can be used in conjunction with sentiment words, such as increment, decimals, and inverted words, were not counted in the previous uni-gram lexicon system; only sentiment words were [12]. SVM and lexicon-based classification are the two methods used for analysis at the aspect level. Lexicon-based models are 84% less accurate than SVM [14]. Sentiment analysis may be easily performed at the feature and sentence level thanks to the lexicon-based approach. It might be considered an unsupervised method because the training data doesn't need to be processed [11]. Lexicon sentiment analysis, commonly referred to as "Dictionary Sentiment Analysis," is the process of analyzing an article's sentiment using a predetermined list of terms and their sentiment ratings. The method is based on a lexicon or dictionary of words along with their polarity, that is, whether a term has a “positive”, “negative”, or “neutral” meaning. One advantage of the lexicon-based method is that it doesn't require any training data, which is why some experts even refer to it as an unsupervised approach [15]. The primary drawback of the lexicon-based method is its strong domain specialization, which prevents terms from one domain from being utilized in another [16].
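The lexicon-based scoring described in Section 2.2 (look up each word's polarity in a predetermined list and add the scores together) can be illustrated with a minimal Python sketch. The lexicon `TOY_LEXICON`, its score values, and the function names are hypothetical and chosen only for illustration; they are not taken from any of the cited works or from a published sentiment dictionary.

```python
import re

# Hypothetical toy lexicon: word -> polarity score (illustrative values only).
TOY_LEXICON = {
    "good": 1, "great": 2, "excellent": 2,
    "bad": -1, "poor": -1, "terrible": -2,
}

def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the polarity scores of the words in `text` found in `lexicon`."""
    words = re.findall(r"[a-z']+", text.lower())
    # Words missing from the lexicon contribute 0 to the total.
    return sum(lexicon.get(word, 0) for word in words)

def lexicon_label(text, lexicon=TOY_LEXICON):
    """Map the summed score to a positive / negative / neutral label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

No training data is involved: the only resource is the word list itself. This also makes the domain-dependence drawback visible, since a word absent from the lexicon, or one whose polarity differs in another domain, is scored as 0 or scored incorrectly.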

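The supervised workflow described in Section 2.1 (train a model on labeled data, then predict the label of fresh unlabeled input) can be sketched with a small Naive Bayes classifier, one of the classifiers listed there. This is a minimal illustration using only the Python standard library; the class name `NaiveBayesSentiment`, the tokenizer, and the four training sentences are assumptions made for the example and do not reproduce any experiment from the cited studies.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.class_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # log prior of the class
            for tok in tokens:
                if tok in self.vocab:  # words never seen in training are skipped
                    logp += math.log((counts[tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

# Hypothetical labeled training data (illustrative only).
TRAIN_TEXTS = [
    "great phone, love it",
    "excellent battery and great screen",
    "terrible service, very bad",
    "bad quality, poor support",
]
TRAIN_LABELS = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(TRAIN_TEXTS, TRAIN_LABELS)
```

Calling `model.predict("great screen")` on this toy data returns `"positive"`, and `model.predict("bad service")` returns `"negative"`. Unlike a lexicon-based method, the word weights here are learned entirely from the labeled examples, which is why the quality and quantity of labeled data matter.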


