Wednesday 17 June 2015

Rudely Interrupted!

Have you ever been rudely interrupted? You're part way through saying something of significance (to you at least) and the person you are speaking to barges in with a comment or a question. How do you react? Ignore it and carry on regardless? Deal with their comment/question and return to what you were saying? This was one of the problems we faced in the Companions Project when we developed an animated avatar called Samuela capable of engaging in social conversation (see this post for a very brief overview).

Companions Dialogue System Interface

Occasionally, Samuela would make long multi-sentence utterances commenting on what the user had said about their day at work. Here's an example of one of Samuela's long utterances:
"I understand exactly your current situation. It's right that you are pleased with your position at the minute. In my opinion having more free time because of the decreased workload is fantastic. Meeting new people is a great way to pass the time outside of work. I'm sure Peter will provide you with excellent assistance. Try not to let Sarah bother you either."
These long utterances provided the opportunity for (and often provoked) the user to interrupt the avatar mid-speech. We realised that Samuela would need to be able to handle these interruptions and respond to them in a human-like way if she was to engage in believable social conversation with the user. A detailed description of how we implemented this barge-in interruption handling facility can be found here (Crook et al., 2012).

In summary, we faced two problems when developing this interruption handling capability. The first was detecting the occurrence of genuine interruptions and distinguishing them from back-channel utterances from the user (e.g. 'Aha', 'Yes' etc). The second was to equip the system with human-like strategies for responding to them in a natural way and continuing with the conversation.

If the user starts speaking whilst Samuela (denoted as ECA in the figure below) is speaking, then the system applies thresholds to both the intensity and the sustained duration of the audio signal from the user's microphone to determine whether this counts as a genuine interruption. This is illustrated in the schematic below, which shows four cases of Samuela speaking, two of which (cases 3 and 4) are designated as interruptions by the system:
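To make the thresholding idea concrete, here is a minimal sketch in Python. It is not the code used in the Companions system: the function name, the frame-based representation of the audio and the particular threshold values are all assumptions chosen for illustration. It simply treats the user's audio as a genuine interruption when it stays above an intensity threshold for long enough while the ECA is speaking.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BargeInConfig:
    # All values are hypothetical; the post does not give the real thresholds.
    intensity_threshold: float = 0.1  # minimum audio level that counts as speech
    min_duration_ms: int = 500        # how long the level must be sustained
    frame_ms: int = 20                # length of one audio frame


def is_interruption(
    user_levels: Sequence[float],
    eca_speaking: Sequence[bool],
    cfg: BargeInConfig = BargeInConfig(),
) -> bool:
    """Return True if the user's audio stays above the intensity threshold
    for at least the minimum duration while the ECA is speaking."""
    # Number of consecutive loud frames needed (rounded up).
    frames_needed = -(-cfg.min_duration_ms // cfg.frame_ms)
    run = 0
    for level, speaking in zip(user_levels, eca_speaking):
        if speaking and level >= cfg.intensity_threshold:
            run += 1
            if run >= frames_needed:
                return True
        else:
            # Silence, a quiet frame, or the ECA not speaking resets the run.
            run = 0
    return False
```

Under these assumed values, a short burst such as a back-channel 'Aha' lasting 200 ms (10 frames) would not trigger an interruption, whereas 600 ms of sustained speech (30 frames) would.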


The second challenge, which was to equip Samuela with strategies for responding to user barge-in interruptions, required us to understand more about the strategies that humans use in such situations. To gather information about this we analysed some transcripts of the BBC Radio 4 programme Any Questions. This is a discussion programme featuring a panel of public figures, including politicians who regularly interrupt each other - so this was a rich source of examples for us!

In brief, our analysis showed that two things happened when panelists were interrupted: first, the speaker addressed the interruption; second, the speaker resumed or recovered their speech after it. We found it necessary to classify the types of interruption that we observed, and focussed on implementing the 6 that were found to be most common. We then classified the types of recovery that we observed for each type of interruption and implemented these in the system controlling Samuela.
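The two-step pattern (address the interruption, then recover) can be sketched roughly as follows. This is only an illustration of the idea, not the Companions implementation. The post does not list the six interruption types or their recoveries, so the type names, the recovery names and the mapping between them below are invented placeholders.

```python
from enum import Enum
from typing import List


class InterruptionType(Enum):
    # Hypothetical labels; the real six types are described in the paper.
    CLARIFICATION_REQUEST = "clarification_request"
    DISAGREEMENT = "disagreement"


class Recovery(Enum):
    # Hypothetical recovery strategies.
    RESUME = "resume"    # carry on from the point of interruption
    RESTART = "restart"  # say the whole planned utterance again
    ABANDON = "abandon"  # drop the rest of the planned utterance


# Assumed mapping from interruption type to recovery strategy.
RECOVERY_FOR = {
    InterruptionType.CLARIFICATION_REQUEST: Recovery.RESUME,
    InterruptionType.DISAGREEMENT: Recovery.ABANDON,
}


def respond(
    planned: List[str],
    spoken_count: int,
    kind: InterruptionType,
    address_text: str,
) -> List[str]:
    """Return what the system says after being interrupted: first a
    sentence addressing the interruption, then the chosen recovery."""
    output = [address_text]
    recovery = RECOVERY_FOR[kind]
    if recovery is Recovery.RESUME:
        output.extend(planned[spoken_count:])
    elif recovery is Recovery.RESTART:
        output.extend(planned)
    # Recovery.ABANDON adds nothing further.
    return output
```

Here `planned` is the list of sentences the system intended to say and `spoken_count` is how many of them were completed before the user barged in.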

Here are a couple of examples of Samuela responding to user interruptions, taken from the paper. The down arrow in the system turn (S) indicates the point at which the user (U) interrupted (the remainder of what the system had planned to say is shown in italics). The right arrow shows the output of the speech recogniser when the interruption occurred.



We were unable to do a full evaluation of the interruption handling before the end of the project, which is a pity because I believe that this is the most sophisticated user barge-in interruption handling system that has yet been developed.


References

Cavazza, M., Santos de la Cámara, R., Turunen, M., Gil, J., Hakulinen, J., Crook, N.T. and Field, D. (2010) ‘How was your day?’ An Affective Companion ECA Prototype. In: Proceedings of the 11th Annual SIGdial Meeting on Discourse and Dialogue. Tokyo, Japan. 24th – 25th September 2010.

Crook, N.T., Smith, C., Field, D., Cavazza, M., Pulman, S., Moore, R., and Boye, J. (2012) Generating context-sensitive ECA responses to user barge-in interruptions. Journal of Multimodal User Interfaces, 6 (1), 13-25.

Hakulinen, J., Turunen, M., Santos de la Camara, S. and Crook, N.T. (2010) Parallel Processing of Interruptions and Feedback in Companions Affective Dialogue System. In: Proceedings of Interspeech 2010, Makuhari, Japan. 26-30 September 2010.

Smith, C., Crook, N.T., Boye, J., Charlton, D., Pizzi, D., Cavazza, M. and Pulman, S. (2011) Interaction Strategies for an Affective Conversational Agent. Presence, October 2011, Vol. 20, No 5, 395-411.
