Prof Amir Hussain A.Hussain@napier.ac.uk
Professor
Dr Kia Dashtipour K.Dashtipour@napier.ac.uk
Lecturer
Dr Mandar Gogate M.Gogate@napier.ac.uk
Senior Research Fellow
Speech enhancement aims to improve the quality and intelligibility of speech degraded by noise in real-world environments. In recent years, researchers have proposed audio-visual speech enhancement models that go beyond traditional audio-only processing to provide better noise suppression and speech restoration in low-SNR environments where multiple competing background noise sources are present. However, these audio-visual methods are language-dependent because they exploit the correlations between visemes and the uttered speech. In addition, speaker pose variation has been shown to significantly degrade the performance of these models.
This project aims to address these two critical challenges in current audio-visual speech enhancement models: language dependence and sensitivity to speaker pose variation. The following research objectives will contribute to this development.
1. To design a novel multilingual audio-visual (AV) speech enhancement framework exploiting advanced machine learning techniques to address
2. To develop a novel multiview AV speech enhancement framework exploiting image translation and pose-invariant features.
3. To integrate the two frameworks and critically evaluate the robustness and generalisation of the combined framework in a range of real-world environments (e.g. cafeteria and restaurant) and use cases (e.g. car).
Project Acronym | Audio-visual speech enhancement |
---|---|
Status | Project Live |
Funder(s) | Royal Society |
Value | £12,000.00 |
Project Dates | Feb 17, 2023 - Feb 16, 2025 |
COG-MHEAR: Towards cognitively-inspired, 5G-IoT enabled multi-modal hearing aids Mar 1, 2021 - Feb 28, 2026
Embracing the multimodal nature of speech presents both opportunities and challenges for hearing assistive technology: on the one hand there are opportunities for the design of new multimodal audio-visual (AV) algorithms; on the other hand,
Artificial Intelligence (AI)-powered dashboard for COVID-19 related public sentiment and opinion mining in social media platforms May 1, 2020 - Nov 17, 2020
The project will aid in understanding and mitigating the direct and indirect impacts of the COVID-19 pandemic, by creating an AI-driven dashboard for policymakers, public health and clinical practitioners. This will enable continuous monitoring, pr...
KTP Ace Aquatec Jun 1, 2021 - Nov 30, 2023
To develop, test and incorporate innovative AI-based approaches to improve the accuracy of individual fish identification and sea lice detection.
Security and Privacy in Vehicular Ad-hoc Networks Mar 1, 2022 - Sep 30, 2024
The great leap forward in wireless communications technology drives the recent advancements of Vehicular Ad hoc NETworks (VANETs). As a key part of the Intelligent Transportation Systems (ITS) framework, VANETs offer active road safety, and traffic e...
Cross-lingual Audio-visual Speech Enhancement based on Deep Multimodal Learning Jun 1, 2023 - May 31, 2025
Speech enhancement and separation techniques are often used to improve the quality and intelligibility of speech degraded by background distractions, including speech and non-speech noises. We aim to change the current landscape of research and innov...