AVSE Challenge: Audio-Visual Speech Enhancement Challenge

Aldana Blanco, Andrea Lorena; Valentini-Botinhao, Cassia; Klejch, Ondrej; Gogate, Mandar; Dashtipour, Kia; Hussain, Amir; Bell, Peter

doi:10.1109/slt54892.2023.10023284

Authors

Andrea Lorena Aldana Blanco

Cassia Valentini-Botinhao

Ondrej Klejch

Dr Mandar Gogate M.Gogate@napier.ac.uk
Principal Research Fellow

Dr Kia Dashtipour K.Dashtipour@napier.ac.uk
Lecturer

Prof Amir Hussain A.Hussain@napier.ac.uk / hussain.doctor@gmail.com
Professor

Peter Bell

Abstract

Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes, in which audio from the target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated by conducting AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.

Citation

Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. In 2022 IEEE Spoken Language Technology Workshop (SLT) (pp. 465-471). IEEE. https://doi.org/10.1109/slt54892.2023.10023284

Presentation Conference Type	Conference Paper (published)
Conference Name	2022 IEEE Spoken Language Technology Workshop (SLT)
Start Date	Jan 9, 2023
End Date	Jan 12, 2023
Online Publication Date	Jan 27, 2023
Publication Date	2023
Deposit Date	May 17, 2023
Publicly Available Date	May 17, 2023
Publisher	Institute of Electrical and Electronics Engineers
Pages	465-471
Book Title	2022 IEEE Spoken Language Technology Workshop (SLT)
DOI	https://doi.org/10.1109/slt54892.2023.10023284
Public URL	http://researchrepository.napier.ac.uk/Output/3103205