The University of Jordan :: Faculties of Languages :: Research

Research

Read my lips: Artificial intelligence word-level Arabic lipreading system

Abstract
Lipreading is the ability to recognize words or sentences from the mouth movements of a speaking person. This process is also known as Visual Speech Recognition (VSR). Lipreading has two main advantages: it facilitates communication for people with hearing or speech impairments, and it aids speech recognition in noisy environments. In this paper, we propose a lipreading system that recognizes ten common Arabic words from mouth movements. The system takes a video of a person uttering an Arabic word as input and outputs the text of the predicted word. During implementation, three deep learning neural network architectures are each used in turn to train, validate, and test the system on a locally collected and preprocessed dataset. The dataset contains 1051 videos and will be made available upon request. In addition, we propose a voting model that combines the three architectures. The voting model achieves the highest testing accuracy, 82.84%.
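
The abstract does not say how the voting model combines the three architectures. As a minimal sketch of one common option, the Python below assumes hard majority voting over each model's top prediction, with ties broken by mean class probability. The function name `majority_vote`, the three-class probability vectors, and the tie-breaking rule are illustrative assumptions, not the authors' method.

```python
from collections import Counter


def majority_vote(probabilities):
    """Combine per-model class probabilities by hard majority voting.

    `probabilities` is a list with one entry per model; each entry is a
    list of class probabilities over the same ordered set of classes.
    Returns the index of the winning class.
    """
    # Each model votes for its most probable class.
    votes = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    counts = Counter(votes)
    top_count = max(counts.values())
    tied = [cls for cls, n in counts.items() if n == top_count]
    if len(tied) == 1:
        return tied[0]
    # Assumed tie-break: among the tied classes, pick the one with the
    # highest mean probability across all models.
    n_models = len(probabilities)
    return max(tied, key=lambda cls: sum(p[cls] for p in probabilities) / n_models)


# Hypothetical outputs of three models for one video (three classes for brevity).
model_a = [0.7, 0.2, 0.1]
model_b = [0.4, 0.5, 0.1]
model_c = [0.6, 0.3, 0.1]
predicted = majority_vote([model_a, model_b, model_c])
```

Here two of the three hypothetical models favor class 0, so the ensemble predicts class 0 even though one model disagrees. In the ten-word setting each probability list would have ten entries, and the returned index would be mapped back to the corresponding Arabic word.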