Research Repository

Outputs (9)

Building a dual dataset of text- and image-grounded conversations and summarisation in Gàidhlig (Scottish Gaelic) (2023)
Conference Proceeding
Howcroft, D. M., Lamb, W., Groundwater, A., & Gkatzia, D. (2023). Building a dual dataset of text- and image-grounded conversations and summarisation in Gàidhlig (Scottish Gaelic). In Proceedings of the 16th International Natural Language Generation Conference (443-448)

Gàidhlig (Scottish Gaelic; gd) is spoken by about 57k people in Scotland, but remains an under-resourced language with respect to natural language processing in general and natural language generation (NLG) in particular. To address this gap, we deve...

enunlg: a Python library for reproducible neural data-to-text experimentation (2023)
Conference Proceeding
Howcroft, D. M., & Gkatzia, D. (2023). enunlg: a Python library for reproducible neural data-to-text experimentation. In Proceedings of the 16th International Natural Language Generation Conference: System Demonstrations (4-5)

Over the past decade, a variety of neural architectures for data-to-text generation (NLG) have been proposed. However, each system typically has its own approach to pre- and post-processing and other implementation details. Diversity in implementatio...

LOWRECORP: the Low-Resource NLG Corpus Building Challenge (2023)
Conference Proceeding
Chandu, K. R., Howcroft, D., Gkatzia, D., Chung, Y., Hou, Y., Emezue, C., …Adewumi, T. (2023). LOWRECORP: the Low-Resource NLG Corpus Building Challenge. In The 16th International Natural Language Generation Conference: Generation Challenges (1-9)

Most languages in the world do not have sufficient data available to develop neural-network-based natural language generation (NLG) systems. To alleviate this resource scarcity, we propose a novel challenge for the NLG community: low-resource languag...

Most NLG is Low-Resource: here's what we can do about it (2022)
Conference Proceeding
Howcroft, D. M., & Gkatzia, D. (2022). Most NLG is Low-Resource: here's what we can do about it. In Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM) (336-350)

Many domains and tasks in natural language generation (NLG) are inherently 'low-resource', where training data, tools and linguistic analyses are scarce. This poses a particular challenge to researchers and system developers in the era of machine-lea...

What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think (2021)
Conference Proceeding
Howcroft, D. M., & Rieser, V. (2021). What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (8932-8939)

Previous work has shown that human evaluations in NLP are notoriously under-powered. Here, we argue that there are two common factors which make this problem even worse: NLP studies usually (a) treat ordinal data as interval data and (b) operate unde...

OTTers: One-turn Topic Transitions for Open-Domain Dialogue (2021)
Conference Proceeding
Sevegnani, K., Howcroft, D. M., Konstas, I., & Rieser, V. (2021). OTTers: One-turn Topic Transitions for Open-Domain Dialogue. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (2492-2504). https://doi.org/10.18653/v1/2021.acl-long.194

Mixed initiative in open-domain dialogue requires a system to pro-actively introduce new topics. The one-turn topic transition task explores how a system connects two topics in a cooperative and coherent manner. The goal of the task is to generate a...

How speakers adapt object descriptions to listeners under load (2019)
Journal Article
Vogels, J., Howcroft, D. M., Tourtouri, E., & Demberg, V. (2020). How speakers adapt object descriptions to listeners under load. Language, Cognition and Neuroscience, 35(1), 78-92. https://doi.org/10.1080/23273798.2019.1648839

A controversial issue in psycholinguistics is the degree to which speakers employ audience design during language production. Hypothesising that a consideration of the listener’s needs is particularly relevant when the listener is under cognitive loa...

G-TUNA: a corpus of referring expressions in German, including duration information (2017)
Conference Proceeding
Howcroft, D., Vogels, J., & Demberg, V. (2017). G-TUNA: a corpus of referring expressions in German, including duration information. In Proceedings of the 10th International Conference on Natural Language Generation (149-153). https://doi.org/10.18653/v1/w17-3522

Corpora of referring expressions elicited from human participants in a controlled environment are an important resource for research on automatic referring expression generation. We here present G-TUNA, a new corpus of referring expressions for Germa...

Inducing Clause-Combining Rules: A Case Study with the SPaRKy Restaurant Corpus (2015)
Conference Proceeding
White, M., & Howcroft, D. M. (2015). Inducing Clause-Combining Rules: A Case Study with the SPaRKy Restaurant Corpus. In Proceedings of the 15th European Workshop on Natural Language Generation (ENLG) (28-37). https://doi.org/10.18653/v1/w15-4704

We describe an algorithm for inducing clause-combining rules for use in a traditional natural language generation architecture. An experiment pairing lexicalized text plans from the SPaRKy Restaurant Corpus with logical forms obtained by parsing the...