Automatic Human Utility Evaluation of ASR Systems: Does WER Really Predict Performance?
Presentation / Conference Contribution
Favre, B., Cheung, K., Kazemian, S., Lee, A., Liu, Y., Munteanu, C., Nenkova, A., Ochei, D., Penn, G., Tratz, S., Voss, C., & Zeller, F. (2013, August). Automatic Human Utility Evaluation of ASR Systems: Does WER Really Predict Performance? Presented at Interspeech 2013, Lyon, France.
We propose an alternative evaluation metric to Word Error Rate (WER) for the decision audit task of meeting recordings, which exemplifies how to evaluate speech recognition within a legitimate application context. Using machine learning on an initial...