Emotional vocal expressions recognition using the COST 2102 Italian database of emotional speech

Atassi, Hicham; Riviello, Maria Teresa; Smékal, Zdeněk; Hussain, Amir; Esposito, Anna

doi:10.1007/978-3-642-12397-9_21

Authors

Hicham Atassi

Maria Teresa Riviello

Zdeněk Smékal

Prof Amir Hussain (A.Hussain@napier.ac.uk)

Anna Esposito

Abstract

This paper proposes a new speaker-independent approach to classifying emotional vocal expressions using the COST 2102 Italian database of emotional speech. The audio recordings, extracted from video clips of Italian movies, possess a certain degree of spontaneity and are either noisy or slightly degraded by interruptions, which makes the collected stimuli more realistic than those in available emotional databases containing utterances recorded under studio conditions. The audio stimuli represent six basic emotional states: happiness, sarcasm/irony, fear, anger, surprise, and sadness. Under these more realistic conditions, and with a speaker-independent approach, the proposed system classifies the emotions under examination with 60.7% accuracy using a hierarchical structure consisting of a Perceptron and fifteen Gaussian Mixture Models (GMMs), one trained for each pair of the emotions under examination. The features with the highest discriminative power were selected from a large set of spectral, prosodic, and voice quality features using the Sequential Floating Forward Selection (SFFS) algorithm. The results were compared with the subjective evaluation of the stimuli provided by human subjects.
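
The count of fifteen pairwise models follows from the six emotions, since there are C(6, 2) = 15 unordered pairs. The sketch below illustrates that count and one possible way to combine pairwise decisions, namely simple majority voting. The voting rule, the function name `classify_by_pairwise_vote`, and the `pairwise_decide` callable are assumptions made for illustration; the abstract does not specify how the Perceptron and the GMMs are combined in the hierarchy, so this should not be read as the authors' method.

```python
from collections import Counter
from itertools import combinations

# The six emotional states listed in the abstract.
EMOTIONS = ["happiness", "sarcasm/irony", "fear", "anger", "surprise", "sadness"]

# One binary model per unordered pair of emotions: C(6, 2) = 15.
PAIRS = list(combinations(EMOTIONS, 2))


def classify_by_pairwise_vote(features, pairwise_decide):
    """Return the emotion that wins the most pairwise contests.

    `pairwise_decide(features, a, b)` is a hypothetical callable that
    returns either `a` or `b`; in a real system it could wrap a model
    trained on that pair. Majority voting is an assumption here, not
    the combination rule described in the paper.
    """
    votes = Counter(pairwise_decide(features, a, b) for a, b in PAIRS)
    return votes.most_common(1)[0][0]
```

Each emotion takes part in five of the fifteen contests, so an emotion that wins all of its pairwise comparisons always receives the highest vote count under this rule.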

Presentation Conference Type	Conference Paper (Published)
Conference Name	Second COST 2102 International Training School
Start Date	Mar 23, 2009
End Date	Mar 27, 2009
Publication Date	2010
Deposit Date	Oct 16, 2019
Volume	5967 LNCS
Pages	255-267
Series Title	Lecture Notes in Computer Science
Series Number	5967
Series ISSN	0302-9743
Book Title	Development of Multimodal Interfaces: Active Listening and Synchrony. Second COST 2102 International Training School, Dublin, Ireland, March 23-27, 2009, Revised Selected Papers
ISBN	978-3-642-12396-2
DOI	https://doi.org/10.1007/978-3-642-12397-9_21
Keywords	Emotion recognition, speech, Italian database, spectral features, high level features
Public URL	http://researchrepository.napier.ac.uk/Output/1793427
