Trainable Dynamic Subsampling for End-to-End Speech Recognition
(2019)
Presentation / Conference Contribution
Zhang, S., Loweimi, E., Xu, Y., Bell, P., & Renals, S. (2019). Trainable Dynamic Subsampling for End-to-End Speech Recognition. In Proc. Interspeech 2019 (pp. 1413-1417). https://doi.org/10.21437/interspeech.2019-2778
Jointly optimised attention-based encoder-decoder models have yielded impressive speech recognition results. The recurrent neural network (RNN) encoder is a key component in such models: it learns the hidden representations of the inputs. However, i...