AVSE Challenge: Audio-Visual Speech Enhancement Challenge
Aldana Blanco, Andrea Lorena; Valentini-Botinhao, Cassia; Klejch, Ondrej; Gogate, Mandar; Dashtipour, Kia; Hussain, Amir; Bell, Peter
Authors
Andrea Lorena Aldana Blanco
Cassia Valentini-Botinhao
Ondrej Klejch
Dr. Mandar Gogate M.Gogate@napier.ac.uk
Principal Research Fellow
Dr Kia Dashtipour K.Dashtipour@napier.ac.uk
Lecturer
Prof Amir Hussain A.Hussain@napier.ac.uk / hussain.doctor@gmail.com
Professor
Peter Bell
Abstract
Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity to improve speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes, in which audio from the target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated through AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.
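As a rough illustration of how a noisy mix of this kind can be built, the sketch below adds an interfering signal (a competing speaker or a noise recording) to a target signal at a chosen signal-to-noise ratio. It is not the challenge's own data preparation script: the function names `rms` and `mix_at_snr`, the use of an SNR in decibels as the mixing parameter, and the representation of audio as plain lists of samples are all assumptions made for this example.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of audio samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(target, interferer, snr_db):
    """Mix target and interferer so the target-to-interferer ratio is snr_db.

    Hypothetical helper for illustration only. Both inputs are equal-length
    sequences of float samples; the interferer is rescaled, the target is
    left untouched.
    """
    if len(target) != len(interferer):
        raise ValueError("target and interferer must have the same length")
    interferer_rms = rms(interferer)
    if interferer_rms == 0:
        raise ValueError("interferer is silent; SNR is undefined")
    # Gain that brings the interferer to the level implied by snr_db.
    gain = rms(target) / (interferer_rms * 10 ** (snr_db / 20))
    return [t + gain * n for t, n in zip(target, interferer)]


# Toy signals standing in for a target speaker and an interfering source.
target = [1.0, -1.0, 1.0, -1.0]
interferer = [0.5, 0.5, -0.5, -0.5]
noisy_mix = mix_at_snr(target, interferer, snr_db=0.0)
```

Lower `snr_db` values make the interferer louder relative to the target, which corresponds to the more adverse listening conditions the abstract refers to.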
Citation
Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. 2022 IEEE Spoken Language Technology Workshop (SLT) (pp. 465-471). IEEE. https://doi.org/10.1109/slt54892.2023.10023284
| Presentation Conference Type | Conference Paper (published) |
|---|---|
| Conference Name | 2022 IEEE Spoken Language Technology Workshop (SLT) |
| Start Date | Jan 9, 2023 |
| End Date | Jan 12, 2023 |
| Online Publication Date | Jan 27, 2023 |
| Publication Date | 2023 |
| Deposit Date | May 17, 2023 |
| Publicly Available Date | May 17, 2023 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Pages | 465-471 |
| Book Title | 2022 IEEE Spoken Language Technology Workshop (SLT) |
| DOI | https://doi.org/10.1109/slt54892.2023.10023284 |
| Public URL | http://researchrepository.napier.ac.uk/Output/3103205 |
Files
AVSE Challenge: Audio-Visual Speech Enhancement Challenge (accepted version)
(1.7 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/