Research Repository

All Outputs (55)

You Are What You Write: Author re-identification privacy attacks in the era of pre-trained language models (2024)
Journal Article
Plant, R., Giuffrida, V., & Gkatzia, D. (online). You Are What You Write: Author re-identification privacy attacks in the era of pre-trained language models. Computer Speech and Language, https://doi.org/10.1016/j.csl.2024.101746

The widespread use of pre-trained language models has revolutionised knowledge transfer in natural language processing tasks. However, there is a concern regarding potential breaches of user trust due to the risk of re-identification attacks, where m...

Working with troubles and failures in conversation between humans and robots: workshop report (2023)
Journal Article
Förster, F., Romeo, M., Holthaus, P., Wood, L. J., Dondrup, C., Fischer, J. E., Liza, F. F., Kaszuba, S., Hough, J., Nesset, B., Hernández García, D., Kontogiorgos, D., Williams, J., Özkan, E. E., Barnard, P., Berumen, G., Price, D., Cobb, S., Wiltschko, M., Tisserand, L., …Kapetanios, E. (2023). Working with troubles and failures in conversation between humans and robots: workshop report. Frontiers in Robotics and AI, 10, Article 1202306. https://doi.org/10.3389/frobt.2023.1202306

This paper summarizes the structure and findings from the first Workshop on Troubles and Failures in Conversations between Humans and Robots. The workshop was organized to bring together a small, interdisciplinary group of researchers working on misc...

Barriers and enabling factors for error analysis in NLG research (2023)
Journal Article
Van Miltenburg, E., Clinciu, M., Dušek, O., Gkatzia, D., Inglis, S., Leppänen, L., Mahamood, S., Schoch, S., Thomson, C., & Wen, L. (2023). Barriers and enabling factors for error analysis in NLG research. Northern European Journal of Language Technology, 9(1), https://doi.org/10.3384/nejlt.2000-1533.2023.4529

Earlier research has shown that few studies in Natural Language Generation (NLG) evaluate their system outputs using an error analysis, despite known limitations of automatic evaluation metrics and human ratings. This position paper takes the stance...

A Commonsense-Enhanced Document-Grounded Conversational Agent: A Case Study on Task-Based Dialogue (2022)
Book Chapter
Strathearn, C., & Gkatzia, D. (2023). A Commonsense-Enhanced Document-Grounded Conversational Agent: A Case Study on Task-Based Dialogue. In M. Abbas (Ed.), Analysis and Application of Natural Language and Speech Processing (123-144). Springer. https://doi.org/10.1007/978-3-031-11035-1_6

This paper argues that future dialogue systems must retrieve relevant information from multiple structured and unstructured data sources in order to generate natural and informative responses as well as exhibit commonsense capabilities and flexibilit...

Multi3Generation: Multi-task, Multilingual, Multi-Modal Language Generation (2022)
Presentation / Conference Contribution
Barreiro, A., de Souza, J. G., Gatt, A., Bhatt, M., Lloret, E., Erdem, A., Gkatzia, D., Moniz, H., Russo, I., Kepler, F., Calixto, I., Paprzycki, M., Portet, F., Augenstein, I., & Alhasani, M. (2022, June). Multi3Generation: Multi-task, Multilingual, Multi-Modal Language Generation. Poster presented at 23rd Annual Conference of the European Association for Machine Translation (EAMT 2022), Ghent, Belgium

This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an interdisciplinary network of research groups working on different aspects of language generation. This "metapaper" will serve...

Opportunities and risks in the use of AI in career development practice (2022)
Journal Article
Wilson, M., Robertson, P., Cruickshank, P., & Gkatzia, D. (2022). Opportunities and risks in the use of AI in career development practice. Journal of the National Institute for Career Education and Counselling, 48(1), 48-57. https://doi.org/10.20856/jnicec.4807

The Covid-19 pandemic required many aspects of life to move online. This accelerated a broader trend for increasing use of ICT and AI, with implications for both the world of work and career development. This article explores the potential benefits a...

Generating Unambiguous and Diverse Referring Expressions (2020)
Journal Article
Panagiaris, N., Hart, E., & Gkatzia, D. (2021). Generating Unambiguous and Diverse Referring Expressions. Computer Speech and Language, 68, Article 101184. https://doi.org/10.1016/j.csl.2020.101184

Neural Referring Expression Generation (REG) models have shown promising results in generating expressions which uniquely describe visual objects. However, current REG models still lack the ability to produce diverse and unambiguous referring express...

Monitoring Users’ Behavior: Anti-Immigration Speech Detection on Twitter (2020)
Journal Article
Pitropakis, N., Kokot, K., Gkatzia, D., Ludwiniak, R., Mylonas, A., & Kandias, M. (2020). Monitoring Users’ Behavior: Anti-Immigration Speech Detection on Twitter. Machine Learning and Knowledge Extraction, 2(3), 192-215. https://doi.org/10.3390/make2030011

The proliferation of social media platforms changed the way people interact online. However, engagement with social media comes with a price, the users’ privacy. Breaches of users’ privacy, such as the Cambridge Analytica scandal, can reveal how the...

Data-to-Text Generation Improves Decision-Making Under Uncertainty (2017)
Journal Article
Gkatzia, D., Lemon, O., & Rieser, V. (2017). Data-to-Text Generation Improves Decision-Making Under Uncertainty. IEEE Computational Intelligence Magazine, 12(3), 10-17. https://doi.org/10.1109/MCI.2017.2708998

Decision-making is often dependent on uncertain data, e.g. data associated with confidence scores or probabilities. This article presents a comparison of different information presentations for uncertain data and, for the first time, measures their e...

The REAL corpus (2016)
Data
Bartie, P., Mackaness, W., Gkatzia, D., & Rieser, V. (2016). The REAL corpus. [Dataset]

Our interest is in people’s capacity to efficiently and effectively describe geographic objects in urban scenes. The broader ambition is to develop spatial models capable of equivalent functionality able to construct such referring expressions. To th...

Multi-adaptive Natural Language Generation using Principal Component Regression (2014)
Presentation / Conference Contribution
Gkatzia, D., Hastie, H., & Lemon, O. (2014). Multi-adaptive Natural Language Generation using Principal Component Regression. In Proceedings of the 8th International Natural Language Generation Conference (138-142)

We present FeedbackGen, a system that uses a multi-adaptive approach to Natural Language Generation. With the term 'multi-adaptive', we refer to a system that is able to adapt its content to different user groups simultaneously, in our case adapting...

TaskMaster: A Novel Cross-platform Task-based Spoken Dialogue System for Human-Robot Interaction
Presentation / Conference Contribution
Strathearn, C., Yu, Y., & Gkatzia, D. (2023, March). TaskMaster: A Novel Cross-platform Task-based Spoken Dialogue System for Human-Robot Interaction. Presented at HRCI23, Stockholm, Sweden

The most effective way of communication between humans and robots is through natural language communication. However, there are many challenges to overcome before robots can effectively converse in order to collaborate and work together with humans....

Reproducing Human Evaluation of Meaning Preservation in Paraphrase Generation
Presentation / Conference Contribution
Watson, L. N., & Gkatzia, D. (2024, May). Reproducing Human Evaluation of Meaning Preservation in Paraphrase Generation. Presented at HumEval2024 at LREC-COLING 2024, Turin, Italy

Reproducibility is a cornerstone of scientific research, ensuring the reliability and generalisability of findings. The ReproNLP Shared Task on Reproducibility of Evaluations in NLP aims to assess the reproducibility of human evaluation studies. This...

Most NLG is Low-Resource: here's what we can do about it
Presentation / Conference Contribution
Howcroft, D. M., & Gkatzia, D. (2022, December). Most NLG is Low-Resource: here's what we can do about it. Presented at Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), Abu Dhabi, UAE

Many domains and tasks in natural language generation (NLG) are inherently 'low-resource', where training data, tools and linguistic analyses are scarce. This poses a particular challenge to researchers and system developers in the era of machine-lea...

Responsible Design & Evaluation of a Conversational Agent for a National Careers Service
Presentation / Conference Contribution
Wilson, M., Cruickshank, P., Gkatzia, D., & Robertson, P. (2023, September). Responsible Design & Evaluation of a Conversational Agent for a National Careers Service. Presented at Symposium on Future Directions in Information Access (FDIA) 2023, Vienna, Austria

This PhD project applies a research-through-design approach to the development of a conversational agent for a national career service for young people. This includes addressing practical, interactional and ethical aspects of the system. For each asp...

Building a dual dataset of text- and image-grounded conversations and summarisation in Gàidhlig (Scottish Gaelic)
Presentation / Conference Contribution
Howcroft, D. M., Lamb, W., Groundwater, A., & Gkatzia, D. (2023, September). Building a dual dataset of text- and image-grounded conversations and summarisation in Gàidhlig (Scottish Gaelic). Presented at The 16th International Natural Language Generation Conference

Gàidhlig (Scottish Gaelic; gd) is spoken by about 57k people in Scotland, but remains an under-resourced language with respect to natural language processing in general and natural language generation (NLG) in particular. To address this gap, we deve...