DORAS
DCU Online Research Access Service
A new visual speech modelling approach for visual speech recognition

Yu, Dahai, Ghita, Ovidiu, Sutherland, Alistair and Whelan, Paul F. ORCID: 0000-0002-2029-1576 (2012) A new visual speech modelling approach for visual speech recognition. Journal of computing and information technology, 1 (1). pp. 1-11. ISSN 2161-7112

Abstract

In this paper we propose a new learning-based representation, referred to as the Visual Speech Unit (VSU), for visual speech recognition (VSR). The VSU concept extends the standard viseme model currently applied for VSR by including in the representation not only the data associated with the visemes, but also the transitory information between consecutive visemes. The developed speech recognition system consists of several computational stages: (a) lip segmentation, (b) construction of the Expectation-Maximization Principal Component Analysis (EM-PCA) manifolds from the input video image, (c) registration between the models of the VSUs and the EM-PCA data constructed from the input image sequence and (d) recognition of the VSUs using a standard Hidden Markov Model (HMM) classification scheme. We were particularly interested in comparing the classification accuracy obtained for our new VSU models with that attained for standard (MPEG-4) viseme models. The experimental results indicate that the system achieved a 90% recognition rate when applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 52%.
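
As an illustration of stage (b), the following is a minimal sketch of the generic EM iteration for PCA applied to a stack of flattened lip-region frames. It is not the authors' implementation: the function name `em_pca`, its parameters, and the assumption that each frame is already segmented and flattened into a row vector are all hypothetical choices made for this example.

```python
import numpy as np

def em_pca(frames, n_components=3, n_iter=50, seed=0):
    """Estimate a low-dimensional PCA subspace with EM iterations.

    frames: array of shape (n_frames, n_pixels), one flattened image per row
            (assumed layout for this sketch).
    Returns (mean, basis, coords) where basis has orthonormal columns of
    shape (n_pixels, n_components) and coords has shape
    (n_frames, n_components).
    """
    rng = np.random.default_rng(seed)
    mean = frames.mean(axis=0)
    Y = (frames - mean).T                      # centered data, (n_pixels, n_frames)
    W = rng.standard_normal((Y.shape[0], n_components))  # random initial basis
    for _ in range(n_iter):
        # E-step: latent coordinates given the current basis W
        X = np.linalg.solve(W.T @ W, W.T @ Y)  # (n_components, n_frames)
        # M-step: basis that best reconstructs Y from X (least squares)
        W = np.linalg.solve(X @ X.T, X @ Y.T).T
    # Orthonormalize the converged basis; it spans the estimated subspace
    basis, _ = np.linalg.qr(W)
    # Each frame becomes one point on a low-dimensional trajectory
    coords = (basis.T @ Y).T
    return mean, basis, coords
```

Under these assumptions, the rows of `coords` trace a trajectory in the low-dimensional space as the frames advance, which is the kind of data that later stages would register against VSU models and pass to a classifier. The iteration avoids forming the full pixel-by-pixel covariance matrix, which is one common reason to prefer it over a direct eigendecomposition for image data.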

Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: computer vision; Visual Speech Unit; VSU; visual speech recognition; VSR
Subjects: UNSPECIFIED
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Electronic Engineering
Publisher: Academy Publish
Official URL: http://www.academypublish.org/paper/a-new-visual-speech-modelling-approach-for-visual-speech-recognition
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 18543
Deposited On: 16 Jul 2013 13:03 by Mark Sweeney. Last Modified 11 Jan 2019 13:32
