Research Repository

All Outputs (3)

You Are What You Write: Author re-identification privacy attacks in the era of pre-trained language models (2024)
Journal Article
Plant, R., Giuffrida, V., & Gkatzia, D. (2025). You Are What You Write: Author re-identification privacy attacks in the era of pre-trained language models. Computer Speech and Language, 90, Article 101746. https://doi.org/10.1016/j.csl.2024.101746

The widespread use of pre-trained language models has revolutionised knowledge transfer in natural language processing tasks. However, there is a concern regarding potential breaches of user trust due to the risk of re-identification attacks, where m...

CAPE: Context-Aware Private Embeddings for Private Language Learning (2021)
Presentation / Conference Contribution
Plant, R., Gkatzia, D., & Giuffrida, V. (2021, November). CAPE: Context-Aware Private Embeddings for Private Language Learning. Presented at EMNLP 2021 Conference, Punta Cana, Dominican Republic [Online]

Neural language models have contributed to state-of-the-art results in a number of downstream applications including sentiment analysis, intent classification and others. However, obtaining text representations or embeddings using these models risks...

COVID-19 UK Social Media Dataset for Public Health Research (2021)
Data
Plant, R., Hussain, A., & Sheikh, A. (2021). COVID-19 UK Social Media Dataset for Public Health Research. [Data]. https://doi.org/10.17869/enu.2021.2755974

We present a benchmark database of public social media postings from the United Kingdom related to the Covid-19 pandemic for academic research purposes, along with some initial analysis, including a taxonomy of key themes organised by keyword. This r...