Towards intelligibility-oriented audio-visual speech enhancement

Hussain, Tassadaq; Gogate, Mandar; Dashtipour, Kia; Hussain, Amir

All Outputs (4)

Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings. Presented at 2023 IEEE International Conference on A

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via u... Read More about Audio-visual speech enhancement and separation by utilizing multi-modal self-supervised embeddings.

Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings (2023)
Presentation / Conference Contribution
Chern, I., Hung, K., Chen, Y., Hussain, T., Gogate, M., Hussain, A., Tsao, Y., & Hou, J. (2023, June). Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings. Presented at 2023 IEEE International Conference on A

AV-HuBERT, a multi-modal self-supervised learning model, has been shown to be effective for categorical problems such as automatic speech recognition and lip-reading. This suggests that useful audio-visual speech representations can be obtained via u... Read More about Audio-visual speech enhancement and separation by leveraging multimodal self-supervised embeddings.

A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning (2022)
Presentation / Conference Contribution
Hussain, T., Diyan, M., Gogate, M., Dashtipour, K., Adeel, A., Tsao, Y., & Hussain, A. (2022, July). A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning. Presented at 2022 44th Annual International Conference

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals. Despite improving the speech quality, su... Read More about A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning.

Towards intelligibility-oriented audio-visual speech enhancement (2021)
Presentation / Conference Contribution
Hussain, T., Gogate, M., Dashtipour, K., & Hussain, A. (2021, September). Towards intelligibility-oriented audio-visual speech enhancement. Presented at The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021), Online

Existing deep learning (DL) based approaches are generally optimised to minimise the distance between clean and enhanced speech features. These often result in improved speech quality however they suffer from a lack of generalisation and may not deli... Read More about Towards intelligibility-oriented audio-visual speech enhancement.