Exploring the Use of Group Delay for Generalised VTS Based Noise Compensation
Loweimi, Erfan; Barker, Jon; Hain, Thomas
Authors
Erfan Loweimi
Jon Barker
Thomas Hain
Abstract
In earlier work we studied the effect of statistical normalisation for phase-based features and observed that it leads to a significant robustness improvement. This paper explores the extension of the generalised Vector Taylor Series (gVTS) noise compensation approach to the group delay (GD) domain. We discuss the problems it presents, propose some solutions and derive the corresponding formulae. Furthermore, we study the effects of additive and channel noise in the GD domain. We observe that the GD of the noisy observation is a convex combination of the GDs of the clean signal and the additive noise, and that, in the expected sense, the channel GD tends to zero. Experiments on Aurora-4 show that, despite training only on clean speech, the proposed features provide average WER reductions of 0.8% absolute and 4.1% relative compared to an MFCC-based system trained on multi-style data. Combining gVTS with a bottleneck DNN-based system leads to average absolute (relative) WER improvements of 6.0% (23.5%) when training on clean data and 2.5% (13.8%) when using multi-style training with additive noise.
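The convex-combination statement in the abstract can be illustrated numerically. The sketch below is not the paper's derivation; it is a minimal illustration under stated assumptions: an additive-noise-only model y[n] = x[n] + v[n] (no channel), GD computed with the standard identity τ(ω) = Re{FT(n·x[n]) / FT(x[n])}, and hypothetical function names chosen for this example. It checks the exact decomposition |Y|²τ_Y = |X|²τ_X + |V|²τ_V + cross terms. Dropping the cross terms (for example, by assuming they vanish on average for uncorrelated signals) and approximating |Y|² by |X|² + |V|² gives the convex form λτ_X + (1 − λ)τ_V with λ = |X|² / (|X|² + |V|²).

```python
import numpy as np


def spectrum_pair(x, n_fft=512):
    """Return FT(x[n]) and FT(n * x[n]); the latter equals j * dX/dw."""
    n = np.arange(len(x))
    return np.fft.rfft(x, n_fft), np.fft.rfft(n * x, n_fft)


def group_delay(x, n_fft=512):
    """Group delay via tau(w) = Re{ FT(n x[n]) / FT(x[n]) }."""
    X, Xd = spectrum_pair(x, n_fft)
    return np.real(Xd / X)


def gd_decomposition(x, v, n_fft=512):
    """Decompose the GD of y = x + v (additive noise only; an assumption).

    Returns the exact GD rebuilt from its terms, the convex-combination
    approximation (cross terms dropped), and the weight lam.
    """
    X, Xd = spectrum_pair(x, n_fft)
    V, Vd = spectrum_pair(v, n_fft)
    px, pv = np.abs(X) ** 2, np.abs(V) ** 2
    py = np.abs(X + V) ** 2
    tau_x, tau_v = np.real(Xd / X), np.real(Vd / V)
    # Cross terms between the clean signal and the noise.
    cross = np.real(Xd * np.conj(V) + Vd * np.conj(X))
    tau_y_exact = (px * tau_x + pv * tau_v + cross) / py
    lam = px / (px + pv)  # per-bin weight in [0, 1]
    tau_y_convex = lam * tau_x + (1.0 - lam) * tau_v
    return tau_y_exact, tau_y_convex, lam


# Toy usage with synthetic white-noise signals (not speech data).
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
v = 0.5 * rng.standard_normal(256)
tau_exact, tau_convex, lam = gd_decomposition(x, v)
```

On a single frame the convex form is only an approximation, since the cross terms are generally nonzero per bin; the weight λ acts like a per-frequency signal-to-noise weighting, so high-SNR bins stay close to the clean GD.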
Presentation Conference Type | Conference Paper (Published) |
---|---|
Conference Name | ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Start Date | Apr 15, 2018 |
End Date | Apr 20, 2018 |
Online Publication Date | Sep 13, 2018 |
Publication Date | 2018 |
Deposit Date | Apr 4, 2024 |
Publisher | Institute of Electrical and Electronics Engineers |
Series ISSN | 2379-190X |
Book Title | 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
DOI | https://doi.org/10.1109/icassp.2018.8462595 |
Public URL | http://researchrepository.napier.ac.uk/Output/3586515 |