
Towards multilingual audio-visual speech enhancement in real noisy environments

People Involved

Project Description

Speech enhancement aims to improve the overall quality and intelligibility of speech degraded by noise sources in real-world noisy environments. In recent years, researchers have proposed audio-visual speech enhancement models that go beyond traditional audio-only processing to provide better noise suppression and speech restoration in low-SNR environments where multiple competing background noise sources are present. However, these audio-visual speech enhancement methods are language-dependent, as they exploit the correlations between visemes and the uttered speech. In addition, it has been shown that speaker pose variation significantly degrades the performance of these models.
This project aims to address these two critical challenges in current audio-visual speech enhancement models. The following research objectives will contribute to this development.

1. To design a novel multilingual audio-visual (AV) speech enhancement framework exploiting advanced machine learning techniques to address the language dependency of existing models.
2. To develop a novel multiview AV speech enhancement framework exploiting image translation and pose-invariant features.
3. Finally, we will integrate the two frameworks and critically evaluate the robustness and generalisation of the combined framework in a range of real-world environments (e.g. cafeteria and restaurant) and use cases (e.g. car).
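To make the low-SNR conditions mentioned above concrete: evaluation data for speech enhancement is typically built by mixing clean speech with noise at a controlled signal-to-noise ratio. The sketch below (using NumPy; the function name and interface are illustrative, not part of this project) shows how a noise signal can be scaled so that the mixture meets a target SNR in dB.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Mix `speech` with `noise` scaled to achieve the requested SNR (in dB)."""
    # Match the noise length to the speech length by tiling and truncating.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Choose gain g such that speech_power / (g^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise
```

Lower (more negative) `snr_db` values produce harder mixtures with more dominant noise, which is the regime in which audio-visual cues are expected to help most.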

Project Acronym: Audio-visual speech enhancement
Status: Project Live
Funder(s): Royal Society
Value: £12,000.00
Project Dates: Feb 17, 2023 - Feb 16, 2025


