Abstract:
Sentiment analysis has gained remarkable significance across diverse domains, including the
social, economic, sports, political, and commercial sectors. The profusion of varied textual
content on social media platforms makes understanding these narratives essential for
deciphering societal trends, meeting citizen needs, and mitigating offensive content. This is
particularly important for dialectal languages, which are characterized by nuanced variations
in morphology and meaning.
This master's thesis addresses the challenges posed by the limited size of Algerian dialect
datasets and aims to improve the accuracy of sentiment detection. Its primary objective is to
compare the performance of machine learning, deep learning, and transformer-based classifiers
in categorizing a collection of social media posts in the Algerian dialect into two classes:
positive or negative sentiment.
Using a novel dataset, the study provides insights into the efficacy of these algorithms on
this task. It contributes to sentiment analysis research on the Algerian dialect and offers
implications for real-world applications and decision-making processes.
The results are promising. The best single model, BERT base, reaches an accuracy of 87.9%,
followed by the LSTM model at 85% and the linear SVM classifier at 82%.
Moreover, ensemble approaches combining AraGPT2, ArabicBERT base, and DistilBERT further
improved performance: the majority voting ensemble achieved an accuracy of 90%, and the
stacking ensemble attained an accuracy of 91.1%.
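
To illustrate the majority voting idea mentioned above, here is a minimal Python sketch. It is not the implementation used in the thesis. The function name `majority_vote` and the three prediction lists are hypothetical, and the labels are assumed to be encoded as 1 (positive) and 0 (negative). A stacking ensemble would instead train a second-level classifier on the base models' outputs, which is not shown here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists by taking the most common label per sample.

    predictions: list of lists, one list of labels per model, all the same length.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    # zip(*predictions) groups the votes of all models for each sample
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


# Hypothetical predictions from three base models (1 = positive, 0 = negative)
model_a = [1, 0, 1, 1, 0]
model_b = [1, 1, 0, 1, 0]
model_c = [0, 0, 1, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)
```

With an odd number of models and two classes, as assumed here, every sample has a strict majority, so no tie-breaking rule is needed.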