Cross-lingual Audio-visual Speech Enhancement based on Deep Multimodal Learning

Project Description

Speech enhancement and separation techniques are often used to improve the quality and intelligibility of speech degraded by background distractions, including speech and non-speech noises. We aim to change the current landscape of research and innovation in speech enhancement and separation by developing a novel multilingual audiovisual speech enhancement and separation framework based on English and Taiwanese Mandarin. Recently, there has been increasing interest in developing audio-only and audiovisual speech enhancement and separation models [1] to operate in real-world noisy environments, with an emphasis on English speakers. There have been studies in other languages, including Taiwanese Mandarin, but a number of formidable multilingual challenges are yet to be addressed [2, 3].
In this joint project, we will develop and practically evaluate a novel multilingual framework for speech enhancement and separation to reduce background noise and improve the performance of voice communication systems in real-world environments. We will consider challenging use cases, such as human-robot interaction in very noisy environments and automotive applications, where a range of in-car noises (including music, phone ringing, car navigation sounds, children's noise, air conditioning and car audio systems) is known to distract the driver and degrade the performance of hands-free voice communication and recognition systems. Our goal is to develop a first-of-its-kind multilingual speech enhancement and separation framework and to use quantitative and qualitative listening and comprehensibility tests to assess the resulting improvements compared to benchmark approaches.
This joint proposal will build on ongoing collaborative research between two world-leading research groups: one at Edinburgh Napier University in Scotland, and one at the Research Center for Information Technology Innovation at Academia Sinica, Taipei, Taiwan. The outcomes of this project will be made openly available to both national and global research communities.

Project Acronym TACS
Status Project Live
Funder(s) Royal Society of Edinburgh
Value £12,000.00
Project Dates Jun 1, 2023 - May 31, 2025

