Combining Visual and Social Dialogue for Human-Robot Interaction

Gunson, Nancie; Hernandez Garcia, Daniel; Part, Jose L.; Yu, Yanchao; Sieińska, Weronika; Dondrup, Christian; Lemon, Oliver

doi:10.1145/3462244.3481303

Optimising strategies for learning visually grounded word meanings through interaction (2018)
Thesis
Yu, Y. (2018). Optimising strategies for learning visually grounded word meanings through interaction. (Thesis)

Language Grounding is a fundamental problem in AI, regarding how symbols in Natural Language (e.g. words and phrases) refer to aspects of the physical environment (e.g. objects and attributes). In this thesis, our ultimate goal is to address an inte...

An Incremental Dialogue System for Learning Visually Grounded Word Meanings (demonstration system) (2018)
Presentation / Conference Contribution
Yu, Y., Eshghi, A., & Lemon, O. (2018, June). An Incremental Dialogue System for Learning Visually Grounded Word Meanings (demonstration system). Poster presented at Workshop on Dialogue and Perception 2018, Gothenburg

An ensemble model with ranking for social dialogue (2017)
Presentation / Conference Contribution
Papaioannou, I., Curry, A. C., Part, J. L., Shalyminov, I., Xu, X., Yu, Y., …Lemon, O. (2017, December). An ensemble model with ranking for social dialogue. Paper presented at NIPS 2017 Conversational AI Workshop, Long Beach, US

Open-domain social dialogue is one of the long-standing goals of Artificial Intelligence. This year, the Amazon Alexa Prize challenge was announced for the first time, where real customers get to rate systems developed by leading universities worldwi...

Information density and overlap in spoken dialogue (2015)
Journal Article
Dethlefs, N., Hastie, H., Cuayáhuitl, H., Yu, Y., Rieser, V., & Lemon, O. (2016). Information density and overlap in spoken dialogue. Computer Speech and Language, 37, 82-97. https://doi.org/10.1016/j.csl.2015.11.001

Incremental dialogue systems are often perceived as more responsive and natural because they are able to address phenomena of turn-taking and overlapping speech, such as backchannels or barge-ins. Previous work in this area has often identified disti...

Comparing dialogue strategies for learning grounded language from human tutors
Presentation / Conference Contribution
Yu, Y., Lemon, O., & Eshghi, A. (2016, July). Comparing dialogue strategies for learning grounded language from human tutors. Presented at 20th Workshop Series on the Semantics and Pragmatics of Dialogue 2016, New Brunswick, US

We address the problem of interactively learning perceptually grounded word meanings in a multimodal dialogue system. Human tutors can correct, question, and confirm the statements of a dialogue agent which is trying to interactively learn the meanin...

How Much do Robots Understand Rudeness? Challenges in Human-Robot Interaction
Presentation / Conference Contribution
Orme, M., Yu, Y., & Tan, Z. (2024, May). How Much do Robots Understand Rudeness? Challenges in Human-Robot Interaction. Presented at The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy

This paper concerns the pressing need to understand and manage inappropriate language within the evolving human-robot interaction (HRI) landscape. As intelligent systems and robots transition from controlled laboratory settings to everyday households...

Learning how to learn: grounding word meanings through conversation with humans
Presentation / Conference Contribution
Lemon, O., Eshghi, A., & Yu, Y. (2016, October). Learning how to learn: grounding word meanings through conversation with humans. Presented at MI20-HLC, Windsor, UK

Alana: Social dialogue using an ensemble model and a ranker trained on user feedback
Presentation / Conference Contribution
Papaioannou, I., Curry, A. C., Part, J. L., Shalyminov, I., Xu, X., Yu, Y., Dušek, O., Rieser, V., & Lemon, O. (2017, December). Alana: Social dialogue using an ensemble model and a ranker trained on user feedback. Presented at Alexa Prize SocialBot Grand Challenge 1

We describe our Alexa prize system (called ‘Alana’) which consists of an ensemble of bots, combining rule-based and machine learning systems, and using a contextual ranking mechanism to choose system responses. This paper reports on the version of th...

An Incremental Dialogue System for Learning Visually Grounded Language (demonstration system)
Presentation / Conference Contribution
Yu, Y., Eshghi, A., & Lemon, O. (2016, July). An Incremental Dialogue System for Learning Visually Grounded Language (demonstration system). Presented at 20th Workshop Series on the Semantics and Pragmatics of Dialogue 2016, New Brunswick, US

We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic, and bi-directional grammar framework – Dynamic Syntax and Type Theory with Re...

The BURCHAK corpus: A challenge data set for interactive learning of visually grounded word meanings
Presentation / Conference Contribution
Yu, Y., Eshghi, A., Mills, G., & Lemon, O. J. (2017, April). The BURCHAK corpus: A challenge data set for interactive learning of visually grounded word meanings. Presented at The Sixth Workshop on Vision and Language, Valencia, Spain

We motivate and describe a new freely available human-human dialogue data set for interactive learning of visually grounded word meanings through ostensive definition by a tutor to a learner. The data has been collected using a novel, character-by-ch...

Learning how to learn: An adaptive dialogue agent for incrementally learning visually grounded word meanings
Presentation / Conference Contribution
Yu, Y., Eshghi, A., & Lemon, O. (2017, July). Learning how to learn: An adaptive dialogue agent for incrementally learning visually grounded word meanings. Presented at First Workshop on Language Grounding for Robotics, Vancouver, Canada

We present an optimised multi-modal dialogue agent for interactive learning of visually grounded word meanings from a human tutor, trained on real human-human tutoring data. Within a life-long interactive learning period, the agent, trained using Rei...

SpeechCity: A Conversational City Guide based on Open Data
Presentation / Conference Contribution
Rieser, V., Janarthanam, S., Taylor, A., Yu, Y., & Lemon, O. (2014, September). SpeechCity: A Conversational City Guide based on Open Data. Presented at 18th Workshop on the Semantics and Pragmatics of Dialogue, Edinburgh

Incremental Generation of Visually Grounded Language in Dialogue (demonstration system)
Presentation / Conference Contribution
Eshghi, A., Yu, Y., & Lemon, O. (2018, September). Incremental Generation of Visually Grounded Language in Dialogue (demonstration system). Presented at The 9th International Natural Language Generation conference, Edinburgh

Two Alternative Frameworks for Deploying Spoken Dialogue Systems to Mobile Platforms for Evaluation “In the Wild”
Presentation / Conference Contribution
Hastie, H., Aufaure, M.-A., Alexopoulos, P., Bouchard, H., Cuayáhuitl, H., Dethlefs, N., Gašic, M., Guimeráns, A. G., Henderson, J., Lemon, O., others, & Yu, Y. (2014, September). Two Alternative Frameworks for Deploying Spoken Dialogue Systems to Mobile Platforms for Evaluation “In the Wild”. Presented at 18th Workshop on the Semantics and Pragmatics of Dialogue, Edinburgh

We demonstrate two alternative frameworks for testing and evaluating spoken dialogue systems on mobile devices for use “in the wild”. We firstly present a spoken dialogue system that uses third party ASR (Automatic Speech Recognition) and TTS (Text-T...

TaskMaster: A Novel Cross-platform Task-based Spoken Dialogue System for Human-Robot Interaction
Presentation / Conference Contribution
Strathearn, C., Yu, Y., & Gkatzia, D. (2023, March). TaskMaster: A Novel Cross-platform Task-based Spoken Dialogue System for Human-Robot Interaction. Presented at HRCI23, Stockholm, Sweden

The most effective way of communication between humans and robots is through natural language communication. However, there are many challenges to overcome before robots can effectively converse in order to collaborate and work together with humans....

Training an adaptive dialogue policy for interactive learning of visually grounded word meanings
Presentation / Conference Contribution
Yu, Y., Eshghi, A., & Lemon, O. (2016, September). Training an adaptive dialogue policy for interactive learning of visually grounded word meanings. Presented at 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Los Angeles, US

We present a multi-modal dialogue system for interactive learning of perceptually grounded word meanings from a human tutor. The system integrates an incremental, semantic parsing/generation framework - Dynamic Syntax and Type Theory with Records (DS...

MoDEsT: a Modular Dialogue Experiments and Demonstration Toolkit
Presentation / Conference Contribution
Yu, Y., & Oduronbi, D. (2023, July). MoDEsT: a Modular Dialogue Experiments and Demonstration Toolkit. Presented at CUI '23: ACM conference on Conversational User Interfaces, Eindhoven, Netherlands

We present a modular dialogue experiments and demonstration toolkit (MoDEsT) that assists researchers in planning tailored conversational AI-related studies. The platform can: 1) assist users in picking multiple templates based on specific task needs...

The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues
Presentation / Conference Contribution
Jiang, Y., Xu, Y., Zhan, Y., He, W., Wang, Y., Xi, Z., Wang, M., Li, X., Li, Y., & Yu, Y. (2022, June). The CRECIL Corpus: a New Dataset for Extraction of Relations between Characters in Chinese Multi-party Dialogues. Presented at Thirteenth Language Resources and Evaluation Conference, Marseille, France

We describe a new freely available Chinese multi-party dialogue dataset for automatic extraction of dialogue-based character relationships. The data has been extracted from the original TV scripts of a Chinese sitcom called “I Love My Home” with comp...

A Visually-Aware Conversational Robot Receptionist
Presentation / Conference Contribution
Gunson, N., Garcia, D. H., Sieińska, W., Addlesee, A., Dondrup, C., Lemon, O., Part, J. L., & Yu, Y. (2022, September). A Visually-Aware Conversational Robot Receptionist. Presented at 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Edinburgh

Socially Assistive Robots (SARs) have the potential to play an increasingly important role in a variety of contexts including healthcare, but most existing systems have very limited interactive capabilities. We will demonstrate a robot receptionist t...

Combining Visual and Social Dialogue for Human-Robot Interaction
Presentation / Conference Contribution
Gunson, N., Hernandez Garcia, D., Part, J. L., Yu, Y., Sieińska, W., Dondrup, C., & Lemon, O. (2021, October). Combining Visual and Social Dialogue for Human-Robot Interaction. Presented at 2021 International Conference on Multimodal Interaction, Montréal, QC, Canada

We will demonstrate a prototype multimodal conversational AI system that will act as a receptionist in a hospital waiting room, combining visually-grounded dialogue with social conversation. The system supports visual object conversation in the waiti...
