Research Journal of Cell Sciences
Lip Reading with Deep Learning: A Comprehensive Analysis of Model Architectures
Ahmed Cherif
Abstract
Lip reading, a pivotal skill for augmenting communication for the hearing impaired, has advanced significantly with deep learning techniques. This study presents a comprehensive analysis of deep learning model architectures for lip reading using a newly constructed dataset, DATAV1. We explore and evaluate multiple architectural components, including ResBlock3D, Conv3D, Conv2D, TimeDistributed layers, attention mechanisms, and LSTM. Through extensive experimentation and rigorous evaluation, we identify and discuss one of the best-performing architectures for lip reading, which achieves a peak validation accuracy of 98.18%. This research offers insights into effective model selection and lays the groundwork for further advances in human-machine communication through lip-reading systems.

