The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues

Jiang, Yuru; Xu, Yang; Zhan, Yuhang; He, Weikai; Wang, Yilin; Xi, Zixuan; Wang, Meiyun; Li, Xinyu; Li, Yu; Yu, Yanchao

Authors

Yuru Jiang

Yang Xu

Yuhang Zhan

Weikai He

Yilin Wang

Zixuan Xi

Meiyun Wang

Xinyu Li

Yu Li

Dr Yanchao Yu Y.Yu@napier.ac.uk
Lecturer

Abstract

We describe a new, freely available Chinese multi-party dialogue dataset for the automatic extraction of dialogue-based character relationships. The data were extracted from the original TV scripts of a Chinese sitcom called “I Love My Home”, which features complex, family-centered everyday spoken conversations in Chinese. We first introduce a human annotation scheme for both the global character relationship map and character reference relationships, and then generate dialogue-based character relationship triples. In total, the corpus annotates relationships between 140 entities. We also carried out a data exploration experiment, deploying a BERT-based model to extract character relationships on the CRECIL corpus and on an existing relation extraction corpus (DialogRE (CITATION)). The results demonstrate that extracting character relationships is more challenging in CRECIL than in DialogRE.

Citation

Jiang, Y., Xu, Y., Zhan, Y., He, W., Wang, Y., Xi, Z., Wang, M., Li, X., Li, Y., & Yu, Y. (2022, June). The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues. Presented at the Thirteenth Language Resources and Evaluation Conference, Marseille, France.

Presentation Conference Type	Conference Paper (published)
Conference Name	Thirteenth Language Resources and Evaluation Conference
Start Date	Jun 20, 2022
End Date	Jun 25, 2022
Publication Date	2022
Deposit Date	Jun 27, 2023
Publicly Available Date	Jun 27, 2023
Pages	2337-2344
Book Title	Proceedings of the Thirteenth Language Resources and Evaluation Conference
Publisher URL	https://aclanthology.org/2022.lrec-1.250

Files

The CRECIL Corpus: A New Dataset For Extraction Of Relations Between Characters In Chinese Multi-party Dialogues (1.2 Mb)
PDF

Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/
