AVSE Challenge: Audio-Visual Speech Enhancement Challenge
Authors
Andrea Lorena Aldana Blanco
Cassia Valentini-Botinhao
Ondrej Klejch
Dr Mandar Gogate (Principal Research Fellow), M.Gogate@napier.ac.uk
Dr Kia Dashtipour (Lecturer), K.Dashtipour@napier.ac.uk
Prof Amir Hussain (Professor), A.Hussain@napier.ac.uk
Peter Bell
Abstract
Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity to improve speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes in which audio from the target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated through AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.
Citation
Aldana Blanco, A. L., Valentini-Botinhao, C., Klejch, O., Gogate, M., Dashtipour, K., Hussain, A., & Bell, P. (2023, January). AVSE Challenge: Audio-Visual Speech Enhancement Challenge. Presented at 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar.
Presentation Conference Type | Conference Paper (Published) |
---|---|
Conference Name | 2022 IEEE Spoken Language Technology Workshop (SLT) |
Start Date | Jan 9, 2023 |
End Date | Jan 12, 2023 |
Online Publication Date | Jan 27, 2023 |
Publication Date | 2023 |
Deposit Date | May 17, 2023 |
Publicly Available Date | May 17, 2023 |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 465-471 |
Book Title | 2022 IEEE Spoken Language Technology Workshop (SLT) |
DOI | https://doi.org/10.1109/slt54892.2023.10023284 |
Public URL | http://researchrepository.napier.ac.uk/Output/3103205 |
Files
AVSE Challenge: Audio-Visual Speech Enhancement Challenge (accepted version)
(1.7 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/