Research Journal of Cell Sciences

Lip Reading with Deep Learning: A Comprehensive Analysis of Model Architectures

Ahmed Cherif

Abstract

Lip reading, an important skill for supporting communication among people with hearing impairments, has advanced significantly with deep learning techniques. This study presents a comprehensive analysis of deep learning model architectures for lip reading using a newly constructed dataset, DATAV1. We evaluate multiple architectural components, including ResBlock3D, Conv3D, Conv2D, TimeDistributed layers, attention mechanisms, and LSTM. Through extensive experimentation and evaluation, we identify and discuss one of the best-performing architectures, which achieves a peak validation accuracy of 98.18%. This research offers insights into effective model selection and lays the groundwork for further advances in human-machine communication through lip-reading systems.

