Abstract:
Communication is essential for human connection, yet the deaf community faces unique
obstacles to it. Sign language, which relies on structured hand movements and gestures, is the
primary means by which deaf people overcome these communication barriers. However, sign
language is often not understood by people outside the deaf community, which makes
interpreters necessary and has motivated the development of techniques that support
interpretation. Despite progress in deep learning, research on recognizing and translating
Arabic sign language remains limited, which led us to focus specifically on advancing this
area. This thesis introduces improved methodologies to construct a comprehensive
framework for processing, translating, and generating Arabic sign language from input
videos. We begin by using the MediaPipe library to extract human body landmarks. Then, for
sign language recognition, particularly in Arabic, we employ four distinct models: YOLOv8, a
Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) network, and a
hybrid CNN-LSTM approach. We adapted the ArabSign-A dataset [59] to focus on individual
words and achieved accuracies of 87.37% for YOLOv8, 95.23% for the CNN model, 88.09%
for the LSTM model, and 96.66% for the hybrid model. A comparative
analysis with prior research was conducted to evaluate our methodology, demonstrating
superior discrimination between static signs.
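
To make the landmark-extraction step concrete, the following Python sketch illustrates how MediaPipe (here the Holistic solution) can be used to obtain body and hand landmarks from video frames; the function name and the input file "sign.mp4" are hypothetical, and the snippet is an illustrative example rather than the exact implementation used in this thesis.

    import cv2
    import mediapipe as mp

    mp_holistic = mp.solutions.holistic

    def extract_landmarks(video_path):
        """Return per-frame pose and hand landmarks detected by MediaPipe Holistic."""
        frames_landmarks = []
        cap = cv2.VideoCapture(video_path)
        with mp_holistic.Holistic(static_image_mode=False) as holistic:
            while cap.isOpened():
                ok, frame = cap.read()
                if not ok:
                    break
                # MediaPipe expects RGB input; OpenCV reads frames as BGR.
                results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                frames_landmarks.append({
                    "pose": results.pose_landmarks,
                    "left_hand": results.left_hand_landmarks,
                    "right_hand": results.right_hand_landmarks,
                })
        cap.release()
        return frames_landmarks

    landmarks = extract_landmarks("sign.mp4")  # hypothetical input video

Landmark sequences of this kind can then be passed to the recognition models summarized above.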