A new visual speech modelling approach for visual speech recognition

Yu, Dahai and Ghita, Ovidiu and Sutherland, Alistair and Whelan, Paul F. (2012) A new visual speech modelling approach for visual speech recognition. Journal of computing and information technology, 1 (1). pp. 1-11. ISSN 2161-7112

Abstract

In this paper we propose a new learning-based representation, referred to as the Visual Speech Unit (VSU), for visual speech recognition (VSR). The VSU concept extends the standard viseme model currently applied for VSR by including in the representation not only the data associated with the visemes, but also the transitory information between consecutive visemes. The developed speech recognition system consists of several computational stages: (a) lip segmentation, (b) construction of the Expectation-Maximization Principal Component Analysis (EM-PCA) manifolds from the input video image, (c) registration between the models of the VSUs and the EM-PCA data constructed from the input image sequence, and (d) recognition of the VSUs using a standard Hidden Markov Model (HMM) classification scheme. We were particularly interested in evaluating the classification accuracy obtained for the new VSU models compared with that attained for standard (MPEG-4) viseme models. The experimental results indicate that the system achieved a 90% recognition rate when applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 52%.
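
Stage (b) above relies on EM-PCA. As a rough illustration of that technique only, and not the authors' implementation, the following sketch estimates a single principal direction with the classic EM iteration for PCA, using plain Python lists. The function names, the toy input, and the choice of one component are all assumptions made for this example; the abstract does not state how many components or what image features the paper uses.

```python
import math
import random


def em_pca_first_component(frames, iterations=100, seed=0):
    """Estimate the first principal direction of `frames` with EM-PCA.

    `frames` is a list of equal-length feature vectors (hypothetical
    stand-ins for vectorised lip images). Returns (mean, unit_direction).
    """
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centred = [[f[j] - mean[j] for j in range(d)] for f in frames]

    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(d)]  # random initial direction

    for _ in range(iterations):
        # E-step: latent coordinate of each frame given the current direction.
        ww = sum(v * v for v in w)
        x = [sum(wj * yj for wj, yj in zip(w, y)) / ww for y in centred]
        # M-step: direction that best reconstructs the frames from x.
        xx = sum(v * v for v in x)
        w = [sum(xi * y[j] for xi, y in zip(x, centred)) / xx
             for j in range(d)]

    norm = math.sqrt(sum(v * v for v in w))
    return mean, [v / norm for v in w]


def project(frame, mean, direction):
    """Coordinate of one frame along the learned direction."""
    return sum((fj - mj) * dj for fj, mj, dj in zip(frame, mean, direction))
```

Projecting each frame of a sequence with `project` yields a one-dimensional trajectory; with more components, such a trajectory would trace a curve in the low-dimensional space, which is the kind of data the registration and HMM stages would then consume. The sign of the returned direction is arbitrary, as with any PCA.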

Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: Visual Speech Unit; VSU; visual speech recognition; VSR
Subjects: Engineering > Electronic engineering
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering
Publisher: Academy Publish
Official URL: http://www.academypublish.org/paper/a-new-visual-speech-modelling-approach-for-visual-speech-recognition
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 18543
Deposited On: 16 Jul 2013 14:03 by Mark Sweeney. Last Modified 16 Jul 2013 14:03
