Self-attention is What You Need to Fool a Speaker Recognition System

Wang, Fangwei; Song, Ruixin; Tan, Zhiyuan; Li, Qingru; Wang, Changguang; Yang, Yong

Authors

Fangwei Wang

Ruixin Song

Zhiyuan Tan

Qingru Li

Changguang Wang

Yong Yang



Abstract

Speaker Recognition Systems (SRSs) are becoming increasingly popular in various aspects of life due to advances in technology. However, these systems are vulnerable to cyber threats, particularly adversarial attacks. Traditional adversarial attack methods, such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD), are designed for a white-box setting where attackers have complete knowledge of the inner workings of the target systems. This limits the practicality of these attacks. To overcome this limitation, we propose a new attack model that uses a neural network to generate adversarial examples directly, without the need for full knowledge of the recognition model in a target SRS. In addition, we have designed a novel loss function to balance the effectiveness and confidentiality of adversarial examples. Our new approach was evaluated against SincNet, a state-of-the-art SRS. Experimental results show that our approach achieves outstanding performance, with the best attack success rate of 99.83% and the best Signal-to-Noise Ratio (SNR) value of 41.30.
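
The abstract names two quantities that a short example can make concrete: the FGSM perturbation step and the Signal-to-Noise Ratio used to judge how audible a perturbation is. The sketch below is illustrative only and is not the authors' method, which generates adversarial examples with a neural network. It assumes the common definition of SNR in decibels (ten times the base-10 logarithm of signal power over perturbation power), which the abstract does not spell out. The function names `snr_db` and `fgsm_step`, the toy waveform, and the gradient values are all hypothetical.

```python
import math


def snr_db(clean, adversarial):
    """SNR in dB, treating (adversarial - clean) as the noise.

    Assumes the usual definition 10 * log10(P_signal / P_noise);
    the abstract does not state which formula the paper uses.
    """
    if len(clean) != len(adversarial):
        raise ValueError("waveforms must have the same length")
    signal_power = sum(x * x for x in clean)
    noise_power = sum((a - x) ** 2 for x, a in zip(clean, adversarial))
    if noise_power == 0:
        return math.inf   # no perturbation at all
    if signal_power == 0:
        return -math.inf  # silent input, any perturbation dominates
    return 10.0 * math.log10(signal_power / noise_power)


def fgsm_step(clean, gradient, epsilon):
    """One FGSM-style step: shift each sample by epsilon along the
    sign of the loss gradient with respect to that sample."""
    def sign(g):
        return (g > 0) - (g < 0)
    return [x + epsilon * sign(g) for x, g in zip(clean, gradient)]


# Toy waveform and a made-up gradient (a real white-box attack would
# obtain the gradient by backpropagating through the target model).
clean = [0.5, -0.5, 0.25, -0.25]
gradient = [0.3, -1.2, 0.0, 2.0]
adversarial = fgsm_step(clean, gradient, epsilon=0.01)

print(adversarial)
print(snr_db(clean, adversarial))
```

A higher SNR means the perturbation is smaller relative to the original speech. The need for `gradient` in `fgsm_step` is exactly the white-box requirement the abstract describes as a practical limitation.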

Citation

Wang, F., Song, R., Tan, Z., Li, Q., Wang, C., & Yang, Y. (in press). Self-attention is What You Need to Fool a Speaker Recognition System.

Conference Name: The 22nd IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom-2023)
Conference Location: Exeter, UK
Start Date: Nov 1, 2023
End Date: Nov 3, 2023
Acceptance Date: Sep 8, 2023
Deposit Date: Oct 2, 2023
Publisher: Institute of Electrical and Electronics Engineers
Keywords: speaker recognition systems; adversarial attack; adversarial example; information security
Related Public URLs: https://hpcn.exeter.ac.uk/trustcom2023/