Multi-Modal Acoustic-Articulatory Feature Fusion For Dysarthric Speech Recognition

Yue, Zhengjun; Loweimi, Erfan; Cvetkovic, Zoran; Christensen, Heidi; Barker, Jon

doi:10.1109/icassp43922.2022.9746855

Authors

Zhengjun Yue

Erfan Loweimi

Zoran Cvetkovic

Heidi Christensen

Jon Barker

Abstract

Building automatic speech recognition (ASR) systems for speakers with dysarthria is a challenging task. Although multi-modal ASR has received increasing attention recently, combining real articulatory data with acoustic features has not been widely explored in the dysarthric speech community. This paper investigates the effectiveness of multi-modal acoustic modelling for dysarthric speech recognition using acoustic features along with articulatory information. The proposed multi-stream architectures consist of convolutional, recurrent and fully-connected layers, allowing for bespoke per-stream pre-processing, fusion at the optimal level of abstraction, and post-processing. We study the optimal fusion level and scheme, as well as training dynamics in terms of cross-entropy and word error rate (WER), using the widely used TORGO dysarthric speech database. Experimental results show that fusing the acoustic and articulatory features at the empirically found optimal level of abstraction reduces WER by up to 4.6% absolute (9.6% relative) for speakers with dysarthria.
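The abstract reports the best gain both as an absolute and as a relative WER reduction. The short Python sketch below shows how the two figures relate, and what baseline WER they jointly imply. The function names are illustrative and not taken from the paper, and the abstract does not state the baseline WER itself. Because both reported figures are rounded, the value computed here is only an approximation.

```python
def relative_reduction(baseline_wer: float, absolute_reduction: float) -> float:
    """Relative WER reduction in percent.

    baseline_wer and absolute_reduction are both in percentage points.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * absolute_reduction / baseline_wer


def implied_baseline(absolute_reduction: float, relative_reduction_pct: float) -> float:
    """Baseline WER (in percent) implied by an absolute/relative reduction pair."""
    if relative_reduction_pct <= 0:
        raise ValueError("relative_reduction_pct must be positive")
    return 100.0 * absolute_reduction / relative_reduction_pct


if __name__ == "__main__":
    # Figures from the abstract: 4.6% absolute, 9.6% relative.
    # The baseline below is an inference from rounded numbers, not a reported result.
    baseline = implied_baseline(4.6, 9.6)
    fused = baseline - 4.6
    print(f"implied baseline WER: about {baseline:.1f}%")
    print(f"implied WER after fusion: about {fused:.1f}%")
```

Under these assumptions the pair of figures points to a baseline WER in the high 40s for the speakers concerned. The paper's own results tables are the authoritative source for the exact values.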

Citation

Yue, Z., Loweimi, E., Cvetkovic, Z., Christensen, H., & Barker, J. (2022, May). Multi-Modal Acoustic-Articulatory Feature Fusion For Dysarthric Speech Recognition. Presented at ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, Singapore.

Presentation Conference Type	Conference Paper (Published)
Conference Name	ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Start Date	May 23, 2022
End Date	May 27, 2022
Online Publication Date	Apr 27, 2022
Publication Date	2022
Deposit Date	Apr 3, 2024
Publisher	Institute of Electrical and Electronics Engineers
Series ISSN	2379-190X
Book Title	ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI	https://doi.org/10.1109/icassp43922.2022.9746855
Public URL	http://researchrepository.napier.ac.uk/Output/3585830
