
Research Repository


Evaluating Language Model Vulnerability to Poisoning Attacks in Low-Resource Settings

Plant, Richard; Giuffrida, Mario Valerio; Pitropakis, Nikolaos; Gkatzia, Dimitra


Abstract

Pre-trained language models are a highly effective source of knowledge transfer for natural language processing tasks, as their development represents an investment of resources beyond the reach of most researchers and end users. The widespread availability of such easily adaptable resources has enabled high levels of performance, which is especially valuable for low-resource language users, who have typically been overlooked in NLP applications. However, these models introduce a vulnerability into NLP toolchains: malicious actors with access to the data used for downstream training can attack them. By perturbing instances from the training set, such attacks seek to undermine model capabilities and produce radically different outcomes during inference. We show that adversarial data manipulation has a severe effect on model performance, with BERT's performance dropping by more than 30% on average across all tasks at a poisoning ratio greater than 50%. Additionally, we conduct the first evaluation of this kind in the Basque language domain, establishing the vulnerability of low-resource models to the same form of attack.
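To make the attack model concrete, the following is a minimal sketch of one common form of training-set poisoning: flipping the labels of a chosen fraction (the poisoning ratio) of downstream training examples. The function name, dataset layout, and label-flipping strategy are illustrative assumptions, not the paper's exact procedure.

```python
import random

def poison_labels(dataset, ratio, num_classes, seed=0):
    """Illustrative label-flipping poisoning (an assumption for
    illustration, not the paper's exact method): flip the label of a
    `ratio` fraction of (text, label) pairs to a random wrong class."""
    rng = random.Random(seed)
    poisoned = list(dataset)
    n_poison = int(len(poisoned) * ratio)
    # Pick which training instances to perturb.
    for i in rng.sample(range(len(poisoned)), n_poison):
        text, label = poisoned[i]
        wrong = rng.choice([c for c in range(num_classes) if c != label])
        poisoned[i] = (text, wrong)
    return poisoned

# Toy usage: poison 50% of a tiny binary sentiment training set.
train = [("good movie", 1), ("bad movie", 0),
         ("great film", 1), ("awful plot", 0)]
poisoned = poison_labels(train, ratio=0.5, num_classes=2)
```

A model fine-tuned on `poisoned` instead of `train` would then be evaluated on a clean test set to measure the performance drop at that poisoning ratio.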

Citation

Plant, R., Giuffrida, M. V., Pitropakis, N., & Gkatzia, D. (2024). Evaluating Language Model Vulnerability to Poisoning Attacks in Low-Resource Settings. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 33, 54-67. https://doi.org/10.1109/taslp.2024.3507565

Journal Article Type Article
Acceptance Date Nov 21, 2024
Online Publication Date Nov 28, 2024
Publication Date 2024
Deposit Date Feb 7, 2025
Publicly Available Date Feb 7, 2025
Journal IEEE/ACM Transactions on Audio, Speech, and Language Processing
Print ISSN 2329-9290
Electronic ISSN 2329-9304
Publisher Institute of Electrical and Electronics Engineers
Peer Reviewed Peer Reviewed
Volume 33
Pages 54-67
DOI https://doi.org/10.1109/taslp.2024.3507565
Keywords Language modelling, machine learning methods for HLT, language understanding and computational semantics
Public URL http://researchrepository.napier.ac.uk/Output/3971132

Files

Journal Article: Evaluating Language Model Vulnerability To Poisoning Attacks In Low-Resource Settings (accepted version) (1.3 MB)
PDF