LOWRECORP: the Low-Resource NLG Corpus Building Challenge

Chandu, Khyathi Raghavi; Howcroft, David; Gkatzia, Dimitra; Chung, Yi-Ling; Hou, Yufang; Emezue, Chris; Rajpoot, Pawan; Adewumi, Tosin

Authors

Khyathi Raghavi Chandu

Yi-Ling Chung

Yufang Hou

Chris Emezue

Pawan Rajpoot

Tosin Adewumi



Abstract

Most languages in the world do not have sufficient data available to develop neural-network-based natural language generation (NLG) systems. To alleviate this resource scarcity, we propose a novel challenge for the NLG community: low-resource language corpus development (LOWRECORP). We present an innovative framework to collect a single dataset with dual tasks to maximize the efficiency of data collection efforts and respect language consultant time. Specifically, we focus on a text-chat-based interface for two generation tasks: conversational response generation grounded in a source document and/or image, and dialogue summarization (from the former task). The goal of this shared task is to collectively develop grounded datasets for local and low-resourced languages. To enable data collection, we make available web-based software that can be used to collect these grounded conversations and summaries. Submissions will be assessed on the size, complexity, and diversity of the corpora, to ensure quality control of the datasets, as well as on any enhancements to the interface or novel approaches to grounding conversations.

Citation

Chandu, K. R., Howcroft, D., Gkatzia, D., Chung, Y.-L., Hou, Y., Emezue, C., Rajpoot, P., & Adewumi, T. (2023, September). LOWRECORP: the Low-Resource NLG Corpus Building Challenge. Presented at 16th International Natural Language Generation Conference, Prague, Czechia

Presentation Conference Type: Conference Paper (published)
Conference Name: 16th International Natural Language Generation Conference
Start Date: Sep 13, 2023
End Date: Sep 15, 2023
Acceptance Date: Jul 12, 2023
Online Publication Date: Sep 11, 2023
Publication Date: 2023
Deposit Date: Nov 15, 2023
Publicly Available Date: Nov 15, 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-9
Book Title: The 16th International Natural Language Generation Conference: Generation Challenges
ISBN: 979-8-89176-003-5
Public URL: http://researchrepository.napier.ac.uk/Output/3385919
Publisher URL: https://aclanthology.org/2023.inlg-genchal.1/
