Maximising audiovisual correlation with automatic lip tracking and vowel based segmentation

Abel, A.; Hussain, A.; Nguyen, Q.-D.; Ringeval, F.; Chetouani, M.; Milgram, M.

doi:10.1007/978-3-642-04391-8_9

Authors

A. Abel

Prof Amir Hussain (A.Hussain@napier.ac.uk)

Q.-D. Nguyen

F. Ringeval

M. Chetouani

M. Milgram

Abstract

In recent years, the established link between the various human communication production domains has become more widely utilised in the field of speech processing. In this work, a state-of-the-art Semi Adaptive Appearance Model (SAAM) approach developed by the authors is used for automatic lip tracking, and an adapted version of our vowel-based speech segmentation system is employed to automatically segment speech. Canonical Correlation Analysis (CCA) on segmented and non-segmented data in a range of noisy speech environments finds that segmented speech has a significantly better audiovisual correlation, demonstrating the feasibility of our techniques for further development as part of a proposed audiovisual speech enhancement system.
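For readers unfamiliar with CCA, the following is a minimal sketch of how canonical correlations between an audio feature matrix and a visual feature matrix might be computed. It is not the authors' implementation: the function name `canonical_correlations`, the regularisation term `reg`, and the synthetic "audio" and "visual" features are all assumptions made for illustration, and no lip tracking or vowel segmentation is performed here.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Canonical correlations between X (n x p) and Y (n x q), largest first.

    Hypothetical helper for illustration; `reg` is a small ridge term
    (an assumption) that keeps the covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten each block with its Cholesky factor; the singular values of the
    # whitened cross-covariance Lx^{-1} Sxy Ly^{-T} are the canonical correlations.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    s = np.linalg.svd(M, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-ins for per-frame features (assumed, not real speech data):
# one shared latent signal drives the first column of both modalities.
rng = np.random.default_rng(0)
n = 500
latent = rng.standard_normal((n, 1))
audio = np.hstack([latent + 0.1 * rng.standard_normal((n, 1)),
                   rng.standard_normal((n, 3))])
visual = np.hstack([latent + 0.1 * rng.standard_normal((n, 1)),
                    rng.standard_normal((n, 2))])
# Additive noise on the audio features, loosely mimicking a noisy environment.
noisy_audio = audio + 2.0 * rng.standard_normal(audio.shape)

rho_clean = canonical_correlations(audio, visual)
rho_noisy = canonical_correlations(noisy_audio, visual)
print("clean:", rho_clean)
print("noisy:", rho_noisy)
```

In this toy setting the leading canonical correlation drops when noise is added to the audio features. A comparison in the spirit of the abstract would apply the same measure to features taken from segmented and non-segmented speech.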

Citation

Abel, A., Hussain, A., Nguyen, Q.-D., Ringeval, F., Chetouani, M., & Milgram, M. (2009, September). Maximising audiovisual correlation with automatic lip tracking and vowel based segmentation. Presented at BioID: European Workshop on Biometrics and Identity Management, Madrid, Spain.

Presentation Conference Type	Conference Paper (published)
Conference Name	BioID: European Workshop on Biometrics and Identity Management
Start Date	Sep 16, 2009
End Date	Sep 18, 2009
Publication Date	2009
Deposit Date	Oct 17, 2019
Publisher	Springer
Pages	65-72
Series Title	Lecture Notes in Computer Science
Series Number	5707
Series ISSN	0302-9743
Book Title	Biometric ID Management and Multimodal Communication
ISBN	978-3-642-04390-1
DOI	https://doi.org/10.1007/978-3-642-04391-8_9
Keywords	Canonical Correlation, Canonical Correlation Analysis, Noisy Environment, Speech Enhancement, Visual Speech
Public URL	http://researchrepository.napier.ac.uk/Output/1793513
