Zhang, S., Loweimi, E., Bell, P., & Renals, S. (2021). Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models. In Proc. Interspeech 2021 (pp. 2541-2545). https://doi.org/10.21437/interspeech.2021-280