The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues
Jiang, Yuru; Xu, Yang; Zhan, Yuhang; He, Weikai; Wang, Yilin; Xi, Zixuan; Wang, Meiyun; Li, Xinyu; Li, Yu; Yu, Yanchao
Authors
Yuru Jiang
Yang Xu
Yuhang Zhan
Weikai He
Yilin Wang
Zixuan Xi
Meiyun Wang
Xinyu Li
Yu Li
Dr Yanchao Yu Y.Yu@napier.ac.uk
Lecturer
Abstract
We describe a new freely available Chinese multi-party dialogue dataset for automatic extraction of dialogue-based character relationships. The data has been extracted from the original TV scripts of a Chinese sitcom called “I Love My Home”, which features complex, family-based, everyday spoken conversations in Chinese. First, we introduced a human annotation scheme for both the global character relationship map and character reference relationships. We then generated the dialogue-based character relationship triples. The corpus annotates relationships between 140 entities in total. We also carried out a data exploration experiment by deploying a BERT-based model to extract character relationships on the CRECIL corpus and on another existing relation extraction corpus (DialogRE (CITATION)). The results demonstrate that extracting character relationships is more challenging in CRECIL than in DialogRE.
Citation
Jiang, Y., Xu, Y., Zhan, Y., He, W., Wang, Y., Xi, Z., Wang, M., Li, X., Li, Y., & Yu, Y. (2022, June). The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues. Presented at Thirteenth Language Resources and Evaluation Conference, Marseille, France
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | Thirteenth Language Resources and Evaluation Conference |
Start Date | Jun 20, 2022 |
End Date | Jun 25, 2022 |
Publication Date | 2022 |
Deposit Date | Jun 27, 2023 |
Publicly Available Date | Jun 27, 2023 |
Pages | 2337-2344 |
Book Title | Proceedings of the Thirteenth Language Resources and Evaluation Conference |
Publisher URL | https://aclanthology.org/2022.lrec-1.250 |
Files
The CRECIL Corpus: A New Dataset For Extraction Of Relations Between Characters In Chinese Multi-party Dialogues
(1.2 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/