Maximising audiovisual correlation with automatic lip tracking and vowel based segmentation

Abel, A.; Hussain, A.; Nguyen, Q.-D.; Ringeval, F.; Chetouani, M.; Milgram, M.

Authors

A. Abel

Q.-D. Nguyen

F. Ringeval

M. Chetouani

M. Milgram



Abstract

In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM) approach developed by the authors is used for automatic lip tracking, and an adapted version of our vowel based speech segmentation system is employed to automatically segment speech. Canonical Correlation Analysis (CCA) on segmented and non-segmented data in a range of noisy speech environments finds that segmented speech has a significantly better audiovisual correlation, demonstrating the feasibility of our techniques for further development as part of a proposed audiovisual speech enhancement system.
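As a rough illustration of the CCA step mentioned in the abstract, the sketch below computes canonical correlations between two feature matrices with NumPy. It is not the authors' implementation: the function name `canonical_correlations` and the idea of passing one row per frame of audio and visual features are assumptions made for this example, and the QR-plus-SVD route used here assumes both centred matrices have full column rank and more rows than columns.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q), in descending order.

    Hypothetical helper for illustration: rows are paired observations
    (e.g. one row per frame), columns are features.
    Assumes n > max(p, q) and full column rank after centring.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Centre each feature so that inner products behave like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Orthonormal bases for the column spaces of the centred data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)

    # The singular values of Qx^T Qy are the cosines of the principal angles
    # between the two subspaces, i.e. the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)

    # Guard against tiny floating-point overshoot above 1.
    return np.clip(s, 0.0, 1.0)
```

Under these assumptions, calling `canonical_correlations(audio_feats, visual_feats)` on a segmented and a non-segmented version of the same recording (both arrays are hypothetical placeholders) would give two sets of correlations that can be compared directly, which is the kind of comparison the abstract describes.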

Citation

Abel, A., Hussain, A., Nguyen, Q.-D., Ringeval, F., Chetouani, M., & Milgram, M. (2009, September). Maximising audiovisual correlation with automatic lip tracking and vowel based segmentation. Presented at BioID: European Workshop on Biometrics and Identity Management, Madrid, Spain.

Presentation Conference Type: Conference Paper (Published)
Conference Name: BioID: European Workshop on Biometrics and Identity Management
Start Date: Sep 16, 2009
End Date: Sep 18, 2009
Publication Date: 2009
Deposit Date: Oct 17, 2019
Publisher: Springer
Pages: 65-72
Series Title: Lecture Notes in Computer Science
Series Number: 5707
Series ISSN: 0302-9743
Book Title: Biometric ID Management and Multimodal Communication
ISBN: 978-3-642-04390-1
DOI: https://doi.org/10.1007/978-3-642-04391-8_9
Keywords: Canonical Correlation, Canonical Correlation Analysis, Noisy Environment, Speech Enhancement, Visual Speech
Public URL: http://researchrepository.napier.ac.uk/Output/1793513