Research Repository

Outputs (4)

Building a dual dataset of text- and image-grounded conversations and summarisation in Gàidhlig (Scottish Gaelic) (2023)
Presentation / Conference Contribution
Howcroft, D. M., Lamb, W., Groundwater, A., & Gkatzia, D. (2023, September). Building a dual dataset of text- and image-grounded conversations and summarisation in Gàidhlig (Scottish Gaelic). Presented at The 16th International Natural Language Generation Conference

Gàidhlig (Scottish Gaelic; gd) is spoken by about 57k people in Scotland, but remains an under-resourced language with respect to natural language processing in general and natural language generation (NLG) in particular. To address this gap, we deve...

enunlg: a Python library for reproducible neural data-to-text experimentation (2023)
Presentation / Conference Contribution
Howcroft, D. M., & Gkatzia, D. (2023). enunlg: a Python library for reproducible neural data-to-text experimentation. In Proceedings of the 16th International Natural Language Generation Conference: System Demonstrations (4-5)

Over the past decade, a variety of neural architectures for data-to-text generation (NLG) have been proposed. However, each system typically has its own approach to pre- and post-processing and other implementation details. Diversity in implementatio...

Most NLG is Low-Resource: here's what we can do about it (2022)
Presentation / Conference Contribution
Howcroft, D. M., & Gkatzia, D. (2022). Most NLG is Low-Resource: here's what we can do about it. In Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM) (336-350)

Many domains and tasks in natural language generation (NLG) are inherently 'low-resource', where training data, tools and linguistic analyses are scarce. This poses a particular challenge to researchers and system developers in the era of machine-lea...

What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think (2021)
Presentation / Conference Contribution
Howcroft, D. M., & Rieser, V. (2021, November). What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think. Presented at 2021 Conference on Empirical Methods in Natural Language Processing

Previous work has shown that human evaluations in NLP are notoriously under-powered. Here, we argue that there are two common factors which make this problem even worse: NLP studies usually (a) treat ordinal data as interval data and (b) operate unde...