
Reproducing Human Evaluation of Meaning Preservation in Paraphrase Generation

Authors

Watson, Lewis N.; Gkatzia, Dimitra

Contributors

Simone Balloccu (Editor)
Anya Belz (Editor)
Rudali Huidrom (Editor)
Ehud Reiter (Editor)
João Sedoc (Editor)
Craig Thomson (Editor)

Abstract

Reproducibility is a cornerstone of scientific research, ensuring the reliability and generalisability of findings. The ReproNLP Shared Task on Reproducibility of Evaluations in NLP aims to assess the reproducibility of human evaluation studies. This paper presents a reproduction study of the human evaluation experiment in "Hierarchical Sketch Induction for Paraphrase Generation" by Hosking et al. (2022). The original study employed a human evaluation on Amazon Mechanical Turk, assessing the quality of paraphrases generated by their proposed model using three criteria: meaning preservation, fluency, and dissimilarity. In our reproduction study, we focus on the meaning preservation criterion and utilise the Prolific platform for participant recruitment, following the ReproNLP challenge's common approach to reproduction. We discuss the methodology, results, and implications of our reproduction study, comparing them to the original findings. Our findings contribute to the understanding of reproducibility in NLP research and highlight the potential impact of platform changes and evaluation criteria on the reproducibility of human evaluation studies.

Citation

Watson, L. N., & Gkatzia, D. (2024, May). Reproducing Human Evaluation of Meaning Preservation in Paraphrase Generation. Presented at HumEval2024 at LREC-COLING 2024, Turin, Italy

Presentation Conference Type: Conference Paper (published)
Conference Name: HumEval2024 at LREC-COLING 2024
Start Date: May 21, 2024
End Date: May 21, 2024
Acceptance Date: Apr 9, 2024
Online Publication Date: May 24, 2024
Publication Date: 2024
Deposit Date: Apr 11, 2024
Publicly Available Date: May 27, 2024
Publisher: European Language Resources Association (ELRA)
Peer Reviewed: Peer Reviewed
Pages: 221-228
Series ISSN: 2951-2093
Book Title: The Fourth Workshop on Human Evaluation of NLP Systems (HumEval 2024) Workshop Proceedings
ISBN: 9782493814418
Keywords: reproducibility, NLG, paraphrase generation, human evaluation
Public URL: http://researchrepository.napier.ac.uk/Output/3590573
Publisher URL: https://aclanthology.org/volumes/2024.humeval-1/
Related Public URLs: https://humeval.github.io/

Files

Reproducing Human Evaluation of Meaning Preservation in Paraphrase Generation (PDF, 318 KB)

Publisher Licence URL
http://creativecommons.org/licenses/by-nc/4.0/

Copyright Statement
Copyright ELRA Language Resources Association (ELRA), 2024

These proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
