The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues
Jiang, Yuru; Xu, Yang; Zhan, Yuhang; He, Weikai; Wang, Yilin; Xi, Zixuan; Wang, Meiyun; Li, Xinyu; Li, Yu; Yu, Yanchao
Authors
Yuru Jiang
Yang Xu
Yuhang Zhan
Weikai He
Yilin Wang
Zixuan Xi
Meiyun Wang
Xinyu Li
Yu Li
Dr Yanchao Yu (Y.Yu@napier.ac.uk), Lecturer
Abstract
We describe a new, freely available Chinese multi-party dialogue dataset for the automatic extraction of dialogue-based character relationships. The data were extracted from the original TV scripts of a Chinese sitcom, “I Love My Home”, which features complex, family-based everyday spoken conversations in Chinese. We first introduce a human annotation scheme for both the global character relationship map and character reference relationships, and then generate the dialogue-based character relationship triples. The corpus annotates relationships between 140 entities in total. We also carried out a data exploration experiment by deploying a BERT-based model to extract character relationships on the CRECIL corpus and on an existing relation extraction corpus, DialogRE (CITATION). The results demonstrate that extracting character relationships is more challenging in CRECIL than in DialogRE.
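As an illustration only, a character relationship triple and a global relationship map of the kind the abstract mentions might be represented as in the sketch below. The class name `RelationTriple`, the helper `build_relation_map`, the relation labels, and the placeholder characters are all assumptions made for this example; they are not taken from the CRECIL release or its annotation scheme.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationTriple:
    """Hypothetical (subject, relation, object) triple between two characters."""
    subject: str   # character the relation starts from
    relation: str  # relation label (illustrative label set, not the corpus's)
    obj: str       # character the relation points to


def build_relation_map(triples):
    """Group triples into a map of subject -> {object: relation}."""
    relation_map = defaultdict(dict)
    for t in triples:
        relation_map[t.subject][t.obj] = t.relation
    return dict(relation_map)


# Placeholder characters and labels, not drawn from the corpus.
triples = [
    RelationTriple("Character A", "parent_of", "Character B"),
    RelationTriple("Character B", "child_of", "Character A"),
    RelationTriple("Character B", "sibling_of", "Character C"),
]

relation_map = build_relation_map(triples)
entities = {t.subject for t in triples} | {t.obj for t in triples}
```

Counting the distinct names in `entities` is one simple way to arrive at an entity total like the one the abstract reports, assuming each character is referred to by a single canonical name.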
Citation
Jiang, Y., Xu, Y., Zhan, Y., He, W., Wang, Y., Xi, Z., Wang, M., Li, X., Li, Y., & Yu, Y. (2022, June). The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues. Presented at Thirteenth Language Resources and Evaluation Conference, Marseille, France
Field | Value |
---|---|
Presentation Conference Type | Conference Paper (published) |
Conference Name | Thirteenth Language Resources and Evaluation Conference |
Start Date | Jun 20, 2022 |
End Date | Jun 25, 2022 |
Publication Date | 2022 |
Deposit Date | Jun 27, 2023 |
Publicly Available Date | Jun 27, 2023 |
Pages | 2337-2344 |
Book Title | Proceedings of the Thirteenth Language Resources and Evaluation Conference |
Publisher URL | https://aclanthology.org/2022.lrec-1.250 |
Files
The CRECIL Corpus: A New Dataset For Extraction Of Relations Between Characters In Chinese Multi-party Dialogues (PDF, 1.2 MB)
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/