Soujanya Poria
Towards an intelligent framework for multimodal affective data analysis
Poria, Soujanya; Cambria, Erik; Hussain, Amir; Huang, Guang-Bin
Abstract
An increasingly large amount of multimodal content is posted on social media websites such as YouTube and Facebook every day. In order to cope with the growth of so much multimodal data, there is an urgent need to develop an intelligent multimodal analysis framework that can effectively extract information from multiple modalities. In this paper, we propose a novel multimodal information extraction agent, which infers and aggregates the semantic and affective information associated with user-generated multimodal data in contexts such as e-learning, e-health, automatic video content tagging and human–computer interaction. In particular, the developed intelligent agent adopts an ensemble feature extraction approach by exploiting the joint use of tri-modal (text, audio and video) features to enhance the multimodal information extraction process. In preliminary experiments using the eNTERFACE dataset, our proposed multimodal system is shown to achieve an accuracy of 87.95%, outperforming the best state-of-the-art system by more than 10%, or in relative terms, a 56% reduction in error rate.
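The relationship between the accuracy figure and the relative error reduction quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the baseline accuracy is not stated in the abstract, and the 56% figure is assumed to be rounded, so any value derived from it is approximate.

```python
def relative_error_reduction(baseline_accuracy: float, accuracy: float) -> float:
    """Fraction by which the error rate drops when moving from the baseline to the new system."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - accuracy
    return (baseline_error - new_error) / baseline_error


def implied_baseline_accuracy(accuracy: float, error_reduction: float) -> float:
    """Baseline accuracy implied by a new accuracy and a relative error reduction.

    Assumption: the reduction is taken as exact, although the abstract
    most likely reports a rounded value.
    """
    new_error = 1.0 - accuracy
    baseline_error = new_error / (1.0 - error_reduction)
    return 1.0 - baseline_error


if __name__ == "__main__":
    # Figures taken from the abstract: 87.95% accuracy, 56% relative error reduction.
    baseline = implied_baseline_accuracy(0.8795, 0.56)
    print(f"implied baseline accuracy: {baseline:.4f}")
    print(f"round-trip error reduction: {relative_error_reduction(baseline, 0.8795):.2f}")
```

Under these assumptions the implied baseline accuracy is roughly 73%, which is consistent with the abstract's statement that the improvement exceeds 10%.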
Citation
Poria, S., Cambria, E., Hussain, A., & Huang, G.-B. (2015). Towards an intelligent framework for multimodal affective data analysis. Neural Networks, 63, 104-116. https://doi.org/10.1016/j.neunet.2014.10.005
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Oct 9, 2014 |
Online Publication Date | Nov 6, 2014 |
Publication Date | 2015-03 |
Deposit Date | Sep 27, 2019 |
Journal | Neural Networks |
Print ISSN | 0893-6080 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 63 |
Pages | 104-116 |
DOI | https://doi.org/10.1016/j.neunet.2014.10.005 |
Keywords | Multimodal; Multimodal sentiment analysis; Facial expressions; Speech; Text; Emotion analysis; Affective computing |
Public URL | http://researchrepository.napier.ac.uk/Output/1792971 |