Automated Human-Readable Label Generation in Open Intent Discovery

Anderson, Grant; Hart, Emma; Gkatzia, Dimitra; Beaver, Ian

Authors

Ian Beaver



Abstract

The correct determination of user intent is key in dialog systems. However, an intent classifier often requires a large, labelled training dataset to identify a set of known intents. The creation of such a dataset is a complex and time-consuming task which usually involves humans applying clustering tools to unlabelled data, analysing the results, and creating human-readable labels for each cluster. While many Open Intent Discovery works tackle the problem of discovering clusters of common intent, few generate a human-readable label that can be used to make decisions in downstream systems. To address this, we introduce a novel candidate label extraction method, then evaluate six combinations of candidate extraction and label selection methods on three datasets. We find that our extraction method produces more detailed labels than the alternatives and that high-quality intent labels can be generated from unlabelled data without resorting to costly pre-trained language models.

Citation

Anderson, G., Hart, E., Gkatzia, D., & Beaver, I. (2024, September). Automated Human-Readable Label Generation in Open Intent Discovery. Presented at Interspeech 2024, Kos, Greece.

Presentation Conference Type: Conference Paper (published)
Conference Name: Interspeech 2024
Start Date: Sep 1, 2024
End Date: Sep 5, 2024
Acceptance Date: Jun 4, 2024
Deposit Date: Jun 17, 2024
Peer Reviewed: Peer Reviewed
Keywords: open intent discovery; label generation; PLM prompting