Deep Neural Network Driven Binaural Audio Visual Speech Separation

Gogate, Mandar; Dashtipour, Kia; Bell, Peter; Hussain, Amir

Authors

Peter Bell



Abstract

The central auditory pathway combines the auditory signals received at both ears with visual information from the eyes to segregate speech from multiple competing noise sources and to help resolve phonological ambiguity. Inspired by this human ability, we present a deep neural network (DNN) that ingests the binaural sounds received at the two ears, together with the visual frames, to selectively suppress the competing noise sources individually at each ear. The model exploits the noisy binaural cues and noise-robust visual cues to improve speech intelligibility. Comparative simulation results on objective metrics (PESQ, STOI, SI-SDR and DBSTOI) demonstrate a significant performance improvement of the proposed audio-visual (AV) DNN over the audio-only (A-only) variant of the proposed model. Finally, subjective listening tests with the real noisy AV ASPIRE corpus show the superiority of the proposed AV DNN over state-of-the-art approaches.
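Of the objective metrics named above, SI-SDR (scale-invariant signal-to-distortion ratio) has a compact closed form, so a minimal sketch may help make it concrete. The code below uses the commonly cited definition (zero-mean signals, projection of the estimate onto the reference); it is not taken from the paper. The function names `si_sdr` and `binaural_si_sdr` are hypothetical, and averaging the left-ear and right-ear scores is an assumption made here for illustration, since the abstract does not say how per-ear scores are combined.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)

    # Remove the mean of each signal, as in the usual SI-SDR definition.
    mean_est = sum(estimate) / n
    mean_ref = sum(reference) / n
    est = [x - mean_est for x in estimate]
    ref = [x - mean_ref for x in reference]

    # Project the estimate onto the reference to get the scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]

    # Everything in the estimate that is not the scaled target is distortion.
    distortion = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)

    return 10.0 * math.log10((target_energy + eps) / (distortion_energy + eps))


def binaural_si_sdr(est_left, est_right, ref_left, ref_right):
    """Assumption for illustration: average the per-ear SI-SDR scores."""
    return 0.5 * (si_sdr(est_left, ref_left) + si_sdr(est_right, ref_right))
```

Because the estimate is projected onto the reference before the energies are compared, multiplying the estimate by any nonzero gain leaves the score unchanged, which is the property the "scale-invariant" in the name refers to.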

Citation

Gogate, M., Dashtipour, K., Bell, P., & Hussain, A. (2020, July). Deep Neural Network Driven Binaural Audio Visual Speech Separation. Presented at 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow

Presentation Conference Type: Conference Paper (published)
Conference Name: 2020 International Joint Conference on Neural Networks (IJCNN)
Start Date: Jul 19, 2020
End Date: Jul 24, 2020
Online Publication Date: Sep 28, 2020
Publication Date: 2020
Deposit Date: Apr 15, 2021
Publisher: Institute of Electrical and Electronics Engineers
Peer Reviewed: Peer Reviewed
Pages: 1-7
Series ISSN: 2161-4407
Book Title: 2020 International Joint Conference on Neural Networks (IJCNN)
ISBN: 9781728169262
DOI: https://doi.org/10.1109/ijcnn48605.2020.9207517
Public URL: http://researchrepository.napier.ac.uk/Output/2761846