AVSE Challenge: Audio-Visual Speech Enhancement Challenge
Aldana Blanco, Andrea Lorena; Valentini-Botinhao, Cassia; Klejch, Ondrej; Gogate, Mandar; Dashtipour, Kia; Hussain, Amir; Bell, Peter
Authors
Andrea Lorena Aldana Blanco
Cassia Valentini-Botinhao
Ondrej Klejch
Dr Mandar Gogate M.Gogate@napier.ac.uk
Principal Research Fellow
Dr Kia Dashtipour K.Dashtipour@napier.ac.uk
Lecturer
Prof Amir Hussain A.Hussain@napier.ac.uk
Professor
Peter Bell
Abstract
Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes in which audio from a target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated by conducting AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.
Citation
Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023, January). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. Presented at 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar.
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2022 IEEE Spoken Language Technology Workshop (SLT) |
Start Date | Jan 9, 2023 |
End Date | Jan 12, 2023 |
Online Publication Date | Jan 27, 2023 |
Publication Date | 2023 |
Deposit Date | May 17, 2023 |
Publicly Available Date | May 17, 2023 |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 465-471 |
Book Title | 2022 IEEE Spoken Language Technology Workshop (SLT) |
DOI | https://doi.org/10.1109/slt54892.2023.10023284 |
Public URL | http://researchrepository.napier.ac.uk/Output/3103205 |
Files
AVSE Challenge: Audio-Visual Speech Enhancement Challenge (accepted version)
(1.7 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/