Page 120 - Kaleidoscope Academic Conference Proceedings 2024
2024 ITU Kaleidoscope Academic Conference
to accomplish this classification: 1) Machine Learning Approach; and 2) Lexicon Based Approach.

The rest of the manuscript is structured as follows. Section II outlines SA methods. Section III contains the literature review. Section IV offers a summary of the framework, followed by a comprehensive report in Section V, which also includes suggestions for future research.

2. METHODOLOGY

Sentiment analysis is an automatic method of determining whether a text expresses a positive, negative, or neutral opinion of its subject matter. There are three levels of sentiment classification: document classification, sentence classification, and aspect (feature) classification [7].

• Document level: all content is classified as positive or negative, based on easily navigable data categories.
• Sentence level: the sentiment classification further categorizes sentences as neutral, positive, or negative. This also allows sentences to be classified as subjective or objective.
• Aspect level: this kind of sentiment categorization identifies and extracts item attributes from the original content.

Figure 1 – Classification Level of Sentiment Analysis

NLP is the study of Computer-Assisted Speech and language Processing (CASP) and its application. NLP is extensively used in automated question answering, machine translation, and text mining. Previous studies show that NLP uses a wide range of techniques in different domains, with reported results above 80%, which is quite impressive in comparison to Machine Learning [8]. For SA, the following four classification approaches are employed: 1) Machine Learning Method 2) Lexicon-Based Method 3) Hybrid Method 4) Other methods.

2.1 Machine Learning Approach

There are two main categories of machine learning models: 1) Supervised Learning 2) Unsupervised Learning. Supervised learning is the most widely applied machine learning technique [8]. In this technique, a model is trained on labeled source data. When fresh unlabeled input data is obtained, the trained model can predict an output. Supervised learning is typically more effective than unsupervised or semi-supervised learning techniques. However, depending only on labeled training data might be time-consuming and ineffective [9]. Random Forests, Support Vector Machines, Logistic Regression, Naive Bayes, and Decision Trees are among the classifiers applied in sentiment analysis [10].

Machine learning permits computers to pick up new abilities without requiring additional programming. Popular techniques include Decision Trees (DT), Logistic Regression (LR), K-nearest Neighbors (KNN), Support Vector Machines (SVM), Naïve Bayes (NB), Maximum Entropy (ME), semi-supervised and super-supervised learning [11]. The authors of [12] conclude that, in all cases, Logistic Regression outperforms the other classifiers and that SVM- and LR-based classifiers perform well, while DT-based classifiers provide superior accuracy [12]. Several researchers have examined CNN and its combinations, such as CNN+LSTM. Their performance was contrasted with that of modern models, including CNN/LSTM, KNN/NN, etc. With an accuracy rate of over 97%, this model outperforms all other models. CNN-LSTM (F-measure 88, mean 91, and probability 96.32%) is reported to fare better than the other models (F-measure 92, mean 93, and probability 97.8%). Their evaluation results demonstrate the superiority of the approach over all deep learning models, with BRCAN scoring the highest at 96.32% [13].

2.2 Lexicon Based Approach

The lexicon-based approach picks out the words of a given text that appear in a sentiment lexicon. Usually, their scores are added together to obtain the sentiment of the text. The approach is split into two variants: a) dictionary-based b) corpus-based [11]. Words that can be used in conjunction with sentiment words, such as incrementing, decrementing, and inverting words, were not counted in the previous uni-gram lexicon system; only sentiment words were [12]. SVM and lexicon-based classification are the two methods used for analysis at the aspect level. Lexicon-based models are 84% less accurate than SVM [14]. The lexicon-based approach makes sentiment analysis easy to perform at the feature and sentence level. It might be considered an unsupervised method because no training data needs to be processed [11]. Lexicon sentiment analysis, commonly referred to as "Dictionary Sentiment Analysis," is the process of analyzing an article's sentiment using a predetermined list of terms and their sentiment ratings. The method is based on a lexicon or dictionary of words along with their polarity, that is, whether a term has a “positive”, “negative”, or “neutral” meaning. One advantage of the lexicon-based method is that it does not require any training data; some experts therefore refer to it as an unsupervised approach [15]. The primary drawback of the lexicon-based method is its strong domain specialization, which prevents terms from one domain from being utilized in another [16].
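The supervised workflow described in Section 2.1, in which a model is trained on labeled data and then predicts an output for new unlabeled input, can be illustrated with a minimal sketch. It uses a small Naive Bayes classifier (one of the classifiers named in this section) written from scratch; the training sentences, the labels, and the function names `train_naive_bayes` and `predict` are hypothetical and chosen only for illustration. They are not taken from the cited studies.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities
            count = word_counts[label][tok] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled training data (illustration only)
TRAIN = [
    ("great phone love the battery", "positive"),
    ("excellent screen and great camera", "positive"),
    ("terrible battery and poor screen", "negative"),
    ("poor camera hate the phone", "negative"),
]
MODEL = train_naive_bayes(TRAIN)

print(predict(MODEL, "great camera"))
print(predict(MODEL, "poor battery"))
```

With only four training sentences the model is a toy; in practice the quality of the predictions depends on the amount and quality of labeled data, which is the cost of supervised learning noted in this section.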
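The score-summing idea behind the lexicon-based approach of Section 2.2 can be sketched as follows. The lexicon entries, their scores, and the function names `lexicon_score` and `lexicon_polarity` are assumptions made for illustration; real dictionary-based or corpus-based lexicons are far larger and are not reproduced here.

```python
# Hypothetical uni-gram sentiment lexicon: word -> polarity score
LEXICON = {
    "good": 1,
    "great": 2,
    "love": 2,
    "bad": -1,
    "poor": -1,
    "terrible": -2,
}


def lexicon_score(text, lexicon=LEXICON):
    """Sum the scores of all lexicon words found in the text."""
    tokens = text.lower().split()
    # Strip common punctuation so that "bad." matches "bad"
    return sum(lexicon.get(tok.strip('.,!?;:"'), 0) for tok in tokens)


def lexicon_polarity(text, lexicon=LEXICON):
    """Map the summed score to a positive, negative, or neutral label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_polarity("I love this phone, great camera!"))
print(lexicon_polarity("The battery is bad."))
print(lexicon_polarity("Great screen, but terrible battery."))
```

No training data is involved, which is why the approach is sometimes described as unsupervised. Because this sketch counts sentiment words only, a phrase such as "not good" is still scored as positive, which illustrates the limitation of uni-gram lexicons that ignore inverting words.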