Deep Pre-trained Contrastive Self-Supervised Learning: A Cyberbullying Detection Approach with Augmented Datasets

Alharigy, Lulwah Muhammad; Al-Nuaim, Hana Abdullah; Moradpoor, Naghmeh

Authors

Hana Abdullah Al-Nuaim



Abstract

Cyberbullying is a widespread problem that has only increased in recent years due to the massive dependence on social media. Although there are many approaches for detecting cyberbullying, they still need to be improved for more accurate detection. We need new approaches that understand the context of the words used in cyberbullying by generating different representations of each word. In addition, there is a large amount of unlabelled data on the Internet that needs to be labelled for a more accurate detection process. Even though multiple methods for annotating datasets exist, the most widely used are still manual approaches, relying on either experts or crowdsourcing. However, the time needed and the high cost of labor for manual annotation result in a lack of annotated social network datasets for training a robust cyberbullying detector. Automated approaches, such as Self-Supervised Learning (SSL) models, can be relied upon for labelling data. In this paper, we propose an approach with two main parts. The first part is a parallel BERT + Bi-LSTM model used for detecting cyberbullying terms. The second part uses Contrastive Self-Supervised Learning (a form of SSL) to augment the training set from unlabelled data using a small portion of another manually annotated dataset. Our proposed model, which uses deep pre-trained contrastive self-supervised learning to detect cyberbullying with augmented datasets, achieved a macro-average F1 score of 0.9311. This result shows that our model outperformed the baseline models, namely the top three teams in the SemEval-2020 Task 12 competition.
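
The abstract names two ideas that a short example can make concrete: labelling unlabelled data by similarity to a small annotated set (the keywords list cosine similarity), and scoring with the macro-average F1. The Python sketch below is illustrative only and is not the authors' implementation. The function names, the toy two-dimensional vectors, the 0.9 threshold, and the OFF/NOT label strings are all assumptions made for the example; in practice the vectors would come from a trained encoder.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def pseudo_label(unlabelled, labelled, threshold=0.9):
    """Assign each unlabelled vector the label of its most similar
    labelled vector, keeping only matches at or above the threshold.

    `labelled` is a list of (vector, label) pairs. The threshold value
    is an assumption for this sketch, not a figure from the paper.
    Returns a list of (index_in_unlabelled, label) pairs.
    """
    accepted = []
    for i, vec in enumerate(unlabelled):
        best_sim, best_label = max(
            ((cosine_similarity(vec, lv), lab) for lv, lab in labelled),
            key=lambda pair: pair[0],
        )
        if best_sim >= threshold:
            accepted.append((i, best_label))
    return accepted


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data: two annotated vectors and three unannotated ones.
seed = [([1.0, 0.0], "OFF"), ([0.0, 1.0], "NOT")]
pool = [[0.9, 0.1], [0.5, 0.5], [0.0, 2.0]]
new_labels = pseudo_label(pool, seed)

score = macro_f1(["OFF", "NOT", "NOT", "OFF"], ["OFF", "NOT", "OFF", "OFF"])
```

In this toy run the ambiguous middle vector falls below the threshold and is left unlabelled, which is the usual reason for thresholding: only confident matches are added to the training set. Because the macro average weights every class equally, it is sensitive to performance on the minority class.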

Presentation Conference Type: Conference Paper (Published)
Conference Name: 2022 14th International Conference on Computational Intelligence and Communication Networks (CICN)
Start Date: Dec 4, 2022
End Date: Dec 6, 2022
Acceptance Date: Sep 24, 2022
Online Publication Date: Jan 13, 2023
Publication Date: 2022
Deposit Date: Sep 27, 2022
Publisher: Institute of Electrical and Electronics Engineers
Series ISSN: 2472-7555
Book Title: 2022 14th International Conference on Computational Intelligence and Communication Networks (CICN)
DOI: https://doi.org/10.1109/CICN56167.2022.10008274
Keywords: Cyberbullying Detection, Self-Supervised Learning, Contrastive Self-Supervised Learning, Cosine Similarity
Public URL: http://researchrepository.napier.ac.uk/Output/2924678