Khyathi Raghavi Chandu
LOWRECORP: the Low-Resource NLG Corpus Building Challenge
Chandu, Khyathi Raghavi; Howcroft, David; Gkatzia, Dimitra; Chung, Yi-Ling; Hou, Yufang; Emezue, Chris; Rajpoot, Pawan; Adewumi, Tosin
Authors
Dr. Dave Howcroft D.Howcroft@napier.ac.uk
Associate
Dr Dimitra Gkatzia D.Gkatzia@napier.ac.uk
Associate Professor
Yi-Ling Chung
Yufang Hou
Chris Emezue
Pawan Rajpoot
Tosin Adewumi
Abstract
Most languages in the world do not have sufficient data available to develop neural-network-based natural language generation (NLG) systems. To alleviate this resource scarcity, we propose a novel challenge for the NLG community: low-resource language corpus development (LOWRECORP). We present an innovative framework to collect a single dataset with dual tasks to maximize the efficiency of data collection efforts and respect language consultant time. Specifically, we focus on a text-chat-based interface for two generation tasks: conversational response generation grounded in a source document and/or image, and dialogue summarization (from the former task). The goal of this shared task is to collectively develop grounded datasets for local and low-resourced languages. To enable data collection, we make available web-based software that can be used to collect these grounded conversations and summaries. Submissions will be assessed for the size, complexity, and diversity of the corpora to ensure quality control of the datasets as well as any enhancements to the interface or novel approaches to grounding conversations.
Citation
Chandu, K. R., Howcroft, D., Gkatzia, D., Chung, Y.-L., Hou, Y., Emezue, C., Rajpoot, P., & Adewumi, T. (2023, September). LOWRECORP: the Low-Resource NLG Corpus Building Challenge. Presented at 16th International Natural Language Generation Conference, Prague, Czechia
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 16th International Natural Language Generation Conference |
Start Date | Sep 13, 2023 |
End Date | Sep 15, 2023 |
Acceptance Date | Jul 12, 2023 |
Online Publication Date | Sep 11, 2023 |
Publication Date | 2023 |
Deposit Date | Nov 15, 2023 |
Publicly Available Date | Nov 15, 2023 |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 1-9 |
Book Title | The 16th International Natural Language Generation Conference: Generation Challenges |
ISBN | 979-8-89176-003-5 |
Public URL | http://researchrepository.napier.ac.uk/Output/3385919 |
Publisher URL | https://aclanthology.org/2023.inlg-genchal.1/ |
Files
LowReCorp: the Low-Resource NLG Corpus Building Challenge
(1.9 Mb)
PDF
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/