Disruption Works Chit Chat

Making Voice Automation Human

January 24, 2022 Disruption Works Season 1 Episode 22

We have all had really poor experiences with automated voice phone calls, but the technology has moved on and we should no longer accept that as the standard.

Today we discuss how to make a voice automation more human and what we look at when doing so, from conversational nuances to on-brand voices.

Our latest series of podcasts concentrates on voice and how it is going to impact the next few years, with tips along the way. Find out more about voicebots here, and if you have any subjects that you would like us to discuss, then email info@disruptionworks.co.uk with the subject Podcast and we will see what we can do ;-)

00:00:14.910 --> 00:00:21.830
 Sean Bussell
 Hi everyone, and welcome to another exciting edition of Chit Chat with Sean and Steve from Disruption Works. Steve, how are you feeling today?

00:00:22.020 --> 00:00:29.570
 Steve Tomkinson
 Yeah, not too bad. Not too bad. A bit of a grey day today, but I managed to get a bit of fresh air. Did you take the dog out this morning?

00:00:30.580 --> 00:00:37.810
 Sean Bussell
 Took a while. He goes out every morning, but I always take him out on the weekend to try and absolutely knock him out.

00:00:32.730 --> 00:00:33.370
 Steve Tomkinson
 Yeah.

00:00:38.690 --> 00:00:42.110
 Sean Bussell
 That's always, that's always the task of the weekend if I'm honest.

00:00:42.440 --> 00:00:43.190
 Steve Tomkinson
 Yeah, really.

00:00:42.880 --> 00:00:54.660
 Sean Bussell
 Yeah. Just while it's winter, you know, it's not so nice to go for a long walk after work when it's dark. So the weekend is the time and the place to try and destroy him, basically.

00:00:50.050 --> 00:00:50.360
 Steve Tomkinson
 There.

00:00:54.050 --> 00:00:57.260
 Steve Tomkinson
 He's a nice dog, though. He's good.

00:00:56.770 --> 00:01:21.440
 Sean Bussell
 Yeah, he's a good egg. He's a good egg. OK, let's jump into it, Steve. This week I want to pick up the topic of voice automation again. Now, I suspect most people listening to this will have two experiences of voice automation. They'll have the really bad ones, you know, the nuisance call, let's say, that says, "Oh, you were in an accident. Press 1 to confirm." And then

00:01:21.150 --> 00:01:26.680
 Steve Tomkinson
 That was a lovely impression, by the way, Sean. It's, no, it's not really like it.

00:01:22.850 --> 00:01:25.400
 Sean Bussell
 Yeah, you like that one? I did the voice.

00:01:27.470 --> 00:01:48.500
 Sean Bussell
 And the other end of the spectrum, I guess, is people's experience with home hubs and home assistants, like Google Assistant, for example. So question one is: how do we do the stuff that's more geared towards what Google sounds like, but also how do we make it on brand?

00:01:37.110 --> 00:01:37.470
 Steve Tomkinson
 Yeah.

00:01:49.420 --> 00:01:55.860
 Steve Tomkinson
 Well, first off, you've got conversation flow, which is important, and actually

00:01:56.660 --> 00:02:27.800
 Steve Tomkinson
 The challenge that all the smart speaker assistants have got, you know, you've got Bixby, you've got Google Assistant, you've got Alexa, a load of them, is that they have particular problems because they're very general. Now, normally, if we're doing this for a business, then we don't have to be quite so general. We know what the process is that we're automating, 'cause this is voice automation we're doing. But also we can put the nuances in, because we control the conversation a little bit better.

00:02:28.360 --> 00:02:43.890
 Steve Tomkinson
 So, say we've got an outbound call, which was maybe the nuisance one you were talking about. But if you are introducing yourself, so, "Hi, this is Disruption Works, just calling you about an appointment," then

00:02:44.670 --> 00:03:15.180
 Steve Tomkinson
 Somebody is going to go, like you would, "Sorry, who was that?" You know, that kind of thing, and that could trip automations up quite quickly and quite easily. All those nuances have to be handled, and that is part of the design of making something much more human: to cope with that and come back with, "Oh, it's Disruption Works. We have an appointment with you." So the phrasing in that kind of conversation is as natural as possible.

00:02:51.150 --> 00:02:51.530
 Sean Bussell
 Yeah.

00:03:15.410 --> 00:03:17.330
 Steve Tomkinson
 So you understand what somebody

00:03:18.220 --> 00:03:49.060
 Steve Tomkinson
 Just asked again, and it hasn't kind of tripped up your script. You've included that in the script and in the conversation, and it's an important part of it, because you will have a lot of unhappy-path conversations that have to be handled in a natural way. You can't expect somebody to do exactly what you want them to do. We've got a classic demo that we do, which is, you know, somebody is trying to decide how many people are going to be

00:03:49.120 --> 00:03:51.090
 Steve Tomkinson
 At an appointment, and

00:03:51.980 --> 00:03:58.880
 Steve Tomkinson
 That needs a waiting time, you know, it needs to wait. So, you know, the automation will go,

00:03:59.660 --> 00:04:21.730
 Steve Tomkinson
 "How many people are you having at the property?" And then the person on the other end goes, "Hold on a minute, I need to go and..." And instead of it going, "I got three," you know, kind of trying to understand what the number is, they go, "No, hold on," and that's fine. It goes into a waiting pattern and just goes, "Fine, take all the time you like."

00:04:09.930 --> 00:04:10.740
 Sean Bussell
 Yeah.

00:04:22.400 --> 00:04:33.270
 Steve Tomkinson
 And then it's looking for the key slot data, for them to then go, "Right, it's four we need," and then it goes, "Great, OK, I'll book it for four then."

00:04:34.480 --> 00:05:05.280
 Steve Tomkinson
 All of a sudden you've got a very natural conversation, because you're allowing the unhappy path, so you're allowing that kind of interrupted process into your conversation flow without it being a problem for the user. So that's really important in the whole human feel of a voice automation. You know, we're not trying to fool them into thinking it's a human. We'd never do that, 'cause you're on a highway to nowhere there. But we try and let them forget

00:05:05.440 --> 00:05:11.650
 Steve Tomkinson
 They're speaking to an automation. You know, that's the challenge. That's what we're trying to do.
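
The unhappy-path handling Steve describes (a caller asking "who was that?", or asking the bot to wait while they check a number) can be sketched roughly as below. This is only an illustrative sketch, not how Disruption Works builds its flows: the cue lists, the `respond` function and the keyword matching are all assumptions standing in for a real speech-understanding service.

```python
import re

# Hypothetical cue phrases; a real system would use an NLU model instead.
REPEAT_CUES = ("who was that", "who is this", "say that again", "pardon")
HOLD_CUES = ("hold on", "one moment", "give me a minute", "let me check")
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6}

ASK_PARTY_SIZE = "How many people are you having at the property?"


def extract_party_size(utterance):
    """Return the first number found in the utterance, or None."""
    text = utterance.lower()
    digits = re.search(r"\b(\d+)\b", text)
    if digits:
        return int(digits.group(1))
    for word in re.findall(r"[a-z]+", text):
        if word in NUMBER_WORDS:
            return NUMBER_WORDS[word]
    return None


def respond(utterance, last_prompt):
    """Choose a reply for one caller turn; returns (reply, party_size)."""
    text = utterance.lower()
    if any(cue in text for cue in REPEAT_CUES):
        # Caller did not catch the introduction: re-introduce and re-ask.
        return "It's Disruption Works. " + last_prompt, None
    if any(cue in text for cue in HOLD_CUES):
        # Waiting pattern: do not try to parse a number out of "hold on".
        return "Fine, take all the time you like.", None
    size = extract_party_size(utterance)
    if size is not None:
        return f"Great, OK, I'll book it for {size} then.", size
    return "Sorry, I didn't catch that. " + last_prompt, None


if __name__ == "__main__":
    for turn in ("Sorry, who was that?", "No, hold on a minute", "Right, it's four we need"):
        print(respond(turn, ASK_PARTY_SIZE))
```

The point of checking the repeat and hold cues before looking for a number is that an interruption is treated as a normal part of the flow rather than as a failed answer.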

00:05:11.960 --> 00:05:33.750
 Sean Bussell
 And what you're talking about there is something that's highly sophisticated, that can interpret complex sentences, whereas I think at the lower end of the scale when we talk about voice automation, those calls about, well, what was it, PPI claims and stuff back in the day, or "you've been in an accident," they're only looking for a yes or a no, and then they're going to put you through to a human. It's not

00:05:18.550 --> 00:05:18.830
 Steve Tomkinson
 Yep.

00:05:27.920 --> 00:05:28.170
 Steve Tomkinson
 Yep.

00:05:30.450 --> 00:05:30.810
 Steve Tomkinson
 That's.

00:05:32.970 --> 00:05:33.950
 Steve Tomkinson
 No, there's nothing else.

00:05:34.050 --> 00:05:46.410
 Steve Tomkinson
 You know, and of course you'll have a lot of informed questions, or a lot of informed requests. So, you know, take the appointment setting again. So, "I'd like an appointment for 10 o'clock on Friday, please," you know, and

00:05:47.300 --> 00:06:15.300
 Steve Tomkinson
 That's an informed question. So: "OK, let me have a look. I'll see if I've got something for you." It's not re-asking for 10 o'clock on Friday. It's got that. So: "Yeah, I do have an appointment for 10 o'clock on Friday," or, "I have another appointment at 11 o'clock on Friday. How does that sound?" You know, it's interpreting the context. It's getting all the pieces that it needs, and then it's asking for any information that it's not got, you know?

00:06:15.090 --> 00:06:45.510
 Sean Bussell
 Yeah, so what we're saying here is, if the bot knows it needs five slots of information, let's say, and you give it three, it doesn't go, "You want an appointment on Friday?" having missed the other bits. It does actually capture "this Friday, 10 o'clock, please, for an eye appointment," or whatever information you gave. It can fill up those slots and go, "Right, I know I've got those bits of info, but what I haven't got is this or that," and then it can ask you those particular questions.
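
The slot-filling idea in this exchange (capture everything the caller volunteers, then ask only for what is still missing) might look something like the following minimal sketch. The slot names, the prompts, the list of appointment types and the regular-expression matching are all hypothetical, chosen only to illustrate the technique; a production bot would rely on a proper language-understanding service.

```python
import re

# Hypothetical slots and prompts for an appointment-booking journey.
REQUIRED_SLOTS = ("day", "time", "appointment_type")
PROMPTS = {
    "day": "What day would you like?",
    "time": "What time suits you?",
    "appointment_type": "What type of appointment is it?",
}
DAYS = ("monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday")
APPOINTMENT_TYPES = ("eye", "dental", "hearing")  # illustrative only


def extract_slots(utterance):
    """Pull out whichever slots the caller mentioned in one utterance."""
    text = utterance.lower()
    slots = {}
    for day in DAYS:
        if day in text:
            slots["day"] = day.capitalize()
            break
    time_match = re.search(r"\b(\d{1,2})(?::(\d{2}))?\s*(?:o'clock|am|pm)\b", text)
    if time_match:
        hour = int(time_match.group(1))
        minute = int(time_match.group(2) or 0)
        slots["time"] = f"{hour}:{minute:02d}"
    for kind in APPOINTMENT_TYPES:
        if re.search(rf"\b{kind}\b", text):
            slots["appointment_type"] = kind
            break
    return slots


def next_prompt(filled):
    """Return the question for the first missing slot, or None when complete."""
    for name in REQUIRED_SLOTS:
        if name not in filled:
            return PROMPTS[name]
    return None


if __name__ == "__main__":
    filled = extract_slots("I'd like an appointment for 10 o'clock on Friday, please")
    print(filled, "->", next_prompt(filled))
    filled.update(extract_slots("It's for an eye appointment"))
    print(filled, "->", next_prompt(filled))
```

Because the day and time were captured from the first utterance, the only follow-up question is about the appointment type, which mirrors the "informed request" behaviour discussed above.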

00:06:20.950 --> 00:06:21.330
 Steve Tomkinson
 Yeah.

00:06:26.850 --> 00:06:27.490
 Steve Tomkinson
 Yeah, that's.

00:06:32.940 --> 00:06:33.300
 Steve Tomkinson
 Yeah.

00:06:36.330 --> 00:06:36.690
 Steve Tomkinson
 Yeah.

00:06:45.550 --> 00:07:15.890
 Steve Tomkinson
 Yeah, so what the appointment's for, what the appointment type is, that type of thing. You know, it might be that that's an extra bit that the person hasn't mentioned, so you haven't got that piece, and then it just fills it in. So that's like an informed conversation flow, which you would be able to handle as an agent. You might confirm that back, and a human agent would confirm that back because they want to make sure they've got it right, so the automation would do the same. So there's nothing unnatural in the flow in going, "OK, so

00:07:16.200 --> 00:07:20.020
 Steve Tomkinson
 that was for an appointment, then, at 11 o'clock on Friday.

00:07:21.300 --> 00:07:44.120
 Steve Tomkinson
 Is that correct?" You know, like, "Oh yeah, that's right." "Great. OK, it's booked. Do you want me to send you an email?" and so on. You know, that sort of thing is the process it'll go through, so that's how you can make it as humanistic an approach as possible. The flow is exactly on, you know, and that's what we're trying to achieve.

00:07:24.370 --> 00:07:24.710
 Sean Bussell
 Yeah.

00:07:29.650 --> 00:07:30.110
 Sean Bussell
 Yeah.

00:07:44.350 --> 00:08:00.960
 Sean Bussell
 OK, so in terms of how people hear this stuff, and as I referenced earlier in the call, I guess the top end of voice automation that people are familiar with is Alexa and Google Assistant. So how do we get something to sound as natural as that, or even more natural?

00:08:01.320 --> 00:08:33.430
 Steve Tomkinson
 Well, you've got a lot of options with that. There's something called TTS, which is text-to-speech, and with TTS you've got lots of standard libraries, so you can have a UK English. Say, take Google Assistant, for instance. You'll have UK English, male, female. You have a couple of options, so you can pick voice one, female, UK, and it'll have a UK accent, and that will be the standard TTS that you could use. So you can do a lot of optimization in that

00:08:33.480 --> 00:08:42.710
 Steve Tomkinson
 Just in the way that you go, "I like that particular voice." So if you haven't got a particular brand voice, or you haven't got a voice artist that you use, then

00:08:43.410 --> 00:09:13.410
 Steve Tomkinson
 There's options there, and there's several libraries, so Google have one, and there's lots of other TTSs out there in the wild for you to choose from, you know. Any of those can be used in our service, so you just choose one. But then what we do, if we've got something that's, you know, reasonably straightforward and you've got a voice artist that you're trying to use, so if you've got a campaign on

00:09:13.700 --> 00:09:26.630
 Steve Tomkinson
 Uh, on TV, and you want the voice artist that does the voiceover to do your answering of the phone as part of the voice automation, or to make the outbound call, whichever that may be

00:09:27.380 --> 00:09:36.720
 Steve Tomkinson
 Then we can use a couple of methods to do that, and the simplest method is clean-cut text, so we actually get the voice artist into a

00:09:36.770 --> 00:09:43.500
 Steve Tomkinson
 Yeah, a studio. It doesn't take very long to do. We have a set of words. This is all

00:09:44.310 --> 00:09:50.080
 Steve Tomkinson
 Bread and butter work for a voice artist. They'll do this and they know what they're doing, so this is no challenge for them.

00:09:50.760 --> 00:10:20.710
 Steve Tomkinson
 A professional voice artist will have done this before anyway. All we do is we go, "We need these words, we need these phrases, we need these sounds, we need all these intonations," and we basically spend, you know, maybe an hour in the studio with them, and then we do the post-production to make that all fit into the conversation, the flow and everything that we need to do there. And then that part is where we use our expertise

00:10:20.760 --> 00:10:23.980
 Steve Tomkinson
 To get the conversation flow right. Now, it's clean-cut text.

00:10:24.720 --> 00:10:27.970
 Steve Tomkinson
 So that's pieces, chunks of, kind of, text.

00:10:29.290 --> 00:10:34.370
 Steve Tomkinson
 And that's the simplest way of doing it, and it gives us the really natural

00:10:35.060 --> 00:10:40.610
 Steve Tomkinson
 Conversation that you will hear on our demos, usually, you know, so it's pretty straightforward.
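
One way to picture the recorded-phrase approach is as a lookup from scripted phrases to studio recordings, with a standard TTS voice as a fallback for anything that was not recorded. The sketch below is an assumption made for illustration only: the file paths, the voice name `uk-english-female-1` and the idea of mixing recordings with TTS inside one reply are not described in the episode.

```python
def normalize(phrase):
    """Lower-case and collapse whitespace so lookups are forgiving."""
    return " ".join(phrase.lower().split())


# Hypothetical mapping from scripted phrases to clips recorded by a voice artist.
RECORDED_CLIPS = {
    normalize("Hi, this is Disruption Works, just calling you about an appointment."):
        "clips/introduction.wav",
    normalize("How many people are you having at the property?"):
        "clips/ask_party_size.wav",
    normalize("Fine, take all the time you like."):
        "clips/take_your_time.wav",
}


def plan_playback(phrases, tts_voice="uk-english-female-1"):
    """Turn the phrases of one reply into a list of playback steps."""
    steps = []
    for phrase in phrases:
        clip = RECORDED_CLIPS.get(normalize(phrase))
        if clip is not None:
            # A studio recording exists for this exact phrase.
            steps.append({"action": "play_file", "path": clip})
        else:
            # No recording: fall back to synthesizing the text.
            steps.append({"action": "synthesize", "voice": tts_voice, "text": phrase})
    return steps


if __name__ == "__main__":
    reply = ["Fine, take all the time you like.",
             "Your appointment is at 11 o'clock on Friday."]
    for step in plan_playback(reply):
        print(step)
```

Fixed recordings keep the scripted parts of the call in the brand's own voice, while anything variable, such as a date or time, would need either more recordings or a synthesized voice.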

00:10:38.920 --> 00:10:39.400
 Sean Bussell
 Yeah.

00:10:40.960 --> 00:10:50.160
 Sean Bussell
 And if you are a huge brand, then you can match that voice up. I'm thinking of some of the ones, you know, like Marks and Spencer's. Maybe you might not want that lady answering the phone.

00:10:50.740 --> 00:11:21.600
 Steve Tomkinson
 No, maybe not, but they may have a voice in mind, or, you know, they may have a voice artist they use for other things. So a lot of people will have a voice artist that might just be, kind of, their answerphone message, but in that kind of zone, you know, and that's not a problem. But there's another one as well, which is on top of that. So I mentioned TTS earlier, so text-to-speech service. Now, there's flexibility in that, in the fact that you can then

00:10:58.280 --> 00:10:58.760
 Sean Bussell
 Yeah.

00:11:07.700 --> 00:11:08.320
 Sean Bussell
 Yeah.

00:11:21.900 --> 00:11:24.740
 Steve Tomkinson
 Use that for anything, so that's why

00:11:25.850 --> 00:11:37.630
 Steve Tomkinson
 Google Assistant and whatever will actually use a TTS service, because then it can answer any questions, and it can say any words and any phrase, pretty much on the button.

00:11:38.360 --> 00:11:46.150
 Steve Tomkinson
 Uh, so those phrases are delivered by all the little vowel sounds and things like that that you have in the English language.

00:11:47.350 --> 00:11:54.730
 Steve Tomkinson
 So we can actually do that as a customized TTS for a brand as well, so they have their voice

00:11:55.760 --> 00:11:58.000
 Steve Tomkinson
 For any use at all.

00:11:57.730 --> 00:11:58.640
 Sean Bussell
 Yeah, yeah.

00:11:58.850 --> 00:12:26.900
 Steve Tomkinson
 And there's a little bit more work in it, but actually it's not as complicated as it sounds. You might need, you know, two to three hours of voice artist time, and the post-production takes a little bit longer, but it means there's then a much more flexible service. So you've got a customized TTS that's got your brand voice on there, and you can use it for any future conversations and future conversation flows that you may want to automate.

00:12:27.730 --> 00:12:31.160
 Steve Tomkinson
 You know, it's a really useful kind of asset in its own right.

00:12:27.850 --> 00:12:28.300
 Sean Bussell
 Alright.

00:12:32.410 --> 00:12:39.640
 Sean Bussell
 Good stuff, good stuff. OK, well I think that's it from me on the subject Steve. Is there any final bits that you wanted to add?

00:12:40.140 --> 00:12:51.820
 Steve Tomkinson
 No, I mean, it's just that, you know, this is it. We have to think about voice automation in this way now. You know, you did your admirable impression earlier.

00:12:50.990 --> 00:12:51.440
 Sean Bussell
 Hey.

00:12:52.850 --> 00:13:13.840
 Steve Tomkinson
 How it cannot be, you know, and that is the difference now. This is the difference that we expect to deliver for voice automation. We do not expect a robotic voice. We do not expect a conversation flow that can't handle a natural flow of conversation, you know, with all the ers and ums and

00:13:14.580 --> 00:13:19.890
 Steve Tomkinson
 "No," "uh," "maybe," "I'm not sure." That's just how we talk, so

00:13:20.860 --> 00:13:44.180
 Steve Tomkinson
 You know, don't expect it to be a natural-flowing kind of conversation without those pieces in there. You can't expect it to be like that. We will ask the questions like an agent asks them. We will pre-fill a conversation with the information that's given, like a human agent will, and we try and get as close to that kind of experience as possible.

00:13:45.860 --> 00:13:54.070
 Steve Tomkinson
 And, you know, it will sound absolutely spot on. So that's what the quality and

00:13:55.820 --> 00:14:09.110
 Steve Tomkinson
 And the expectation should be, you know. And I have to say, there aren't many out there doing as nice a job as ours. You know, that said, that was me blowing our own trumpet there. I guess that's what we're here for, but

00:14:09.730 --> 00:14:36.910
 Steve Tomkinson
 You know, it is actually one of those things that needs to be the standard. Expectations are very high. You know, Google Assistant does sound great. So does the Amazon Alexa. All those things sound great, and the expectation is that if you're a brand of any note, then your stuff should sound as good. And if you can get it there, people will use it and not even think about it. You know, it's only a positive message, not a negative.

00:14:37.560 --> 00:14:47.530
 Sean Bussell
 And it's always worth mentioning, even though we build in these nuances of human speech, if you look at the speech, if you like, around the ums and the ers, and "I'm not sure," and "I don't know,"

00:14:47.850 --> 00:14:48.190
 Steve Tomkinson
 Yeah.

00:14:48.320 --> 00:14:51.400
 Sean Bussell
 There's always a way out if someone doesn't want to speak to the bot.

00:14:52.040 --> 00:15:21.970
 Steve Tomkinson
 Oh yeah, absolutely. Handover's important, you know. Like you said, it is important, and all we're trying to do is take, you know, 80 to 90% of the calls that would come in. 'Cause there's always gonna be some outliers that are going to be a complicated mix of call that the automation isn't programmed for, or that it can't really handle. You know, there's no self-service journey for it, because it needs a human agent to do some intuitive thing.

00:15:22.040 --> 00:15:42.580
 Steve Tomkinson
 But, you know, we'll always be in that zone to hand over, very much so, to either staff or to contact centers or whatever it may be. But it's reducing costs. It's being more efficient, it's more available. It's all those things. And it's always on brand. It never has a bad day. You know, it's as simple as that.
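
The handover rule described here (always give the caller a way out, and pass on the calls that have no self-service journey) could be sketched as a simple check run on every turn. Everything named below is an assumption for illustration: the journey names, the cue phrases and the limit of two misunderstandings are not taken from the episode.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical configuration for one deployment.
SUPPORTED_JOURNEYS = {"book_appointment", "cancel_appointment"}
HUMAN_CUES = ("speak to a person", "speak to someone", "human", "agent", "operator")
MAX_MISUNDERSTANDINGS = 2


@dataclass
class CallState:
    journey: Optional[str] = None   # what the caller seems to want, if known
    misunderstandings: int = 0      # turns the bot failed to understand


def should_hand_over(state, utterance):
    """Return (hand_over, reason) for the current caller turn."""
    text = utterance.lower()
    if any(cue in text for cue in HUMAN_CUES):
        return True, "caller asked for a person"
    if state.journey is not None and state.journey not in SUPPORTED_JOURNEYS:
        return True, "no self-service journey"
    if state.misunderstandings >= MAX_MISUNDERSTANDINGS:
        return True, "repeated misunderstandings"
    return False, ""


if __name__ == "__main__":
    print(should_hand_over(CallState("book_appointment"), "Can I speak to a person, please?"))
    print(should_hand_over(CallState("complaint"), "I have a problem with my bill"))
    print(should_hand_over(CallState("book_appointment"), "Friday, please"))
```

New journeys can then be supported by adding them to the set of supported journeys, which fits the later point in the conversation about starting with one simple journey and adding more over time.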

00:15:43.410 --> 00:16:10.480
 Sean Bussell
 It is a case of, like you say, delivering an on-brand message always, and just completely eradicating call wait times, which is always a big pain. Even that in itself, the speed of response, is a big marker for customer experience. If you're slow to respond, it doesn't matter if, once they get through, they get the best service in the world. The fact is, if they waited 20 minutes, they waited 20 minutes, and that's annoying.

00:16:01.140 --> 00:16:01.550
 Steve Tomkinson
 Yeah.

00:16:11.370 --> 00:16:22.520
 Steve Tomkinson
 And you know what? I think now we've come through the other side of a pandemic, and there's been a lot of tolerance through the pandemic. But now, if you can outshine the others

00:16:22.030 --> 00:16:23.640
 Sean Bussell
 Well, I don't know if that's right.

00:16:23.850 --> 00:16:30.060
 Steve Tomkinson
 No? Well, OK, I think we are. I think, you know, we seem to be coming out the other end, and it's less than

00:16:28.970 --> 00:16:35.900
 Sean Bussell
 No, no, I mean more in terms of the tolerance. I see lots of stuff where people have been angry at the speed of response and everything.

00:16:34.450 --> 00:16:34.870
 Steve Tomkinson
 Well.

00:16:35.670 --> 00:17:02.900
 Steve Tomkinson
 Well, I think the tolerance is low now because it's got so bad, because staffing is really precious. Staffing is really low, so they're not responding, literally just turning phone lines off and things like that. If you're a brand that doesn't have to do that because you've thought about this smarter, you're going to win. It's just that simple. And if you're not in that, if you don't realize that, then, you know, you're the one that's not going to win.

00:17:03.610 --> 00:17:16.360
 Steve Tomkinson
 It's that simple. You know, this has to be part of a normal comms package now, I feel. The technology is there to be used. Like we've talked about, we can make it really human.

00:17:17.240 --> 00:17:37.240
 Steve Tomkinson
 So why are you not using it? You know, it's cheaper than contact centers by multiples, and it's always there. And it can be expanded and expanded and expanded on. You start on a simple journey, hand the rest over, then you just keep adding journeys, and it's that simple. So there's always a good starting point.

00:17:38.040 --> 00:17:47.800
 Sean Bussell
 Good stuff, all right. Well, thanks for that, Steve. I hope everyone, uh, enjoyed that and that it gave you some food for thought, and we hope to catch you next time. Thanks very much.

00:17:47.960 --> 00:17:48.990
 Steve Tomkinson
 Alright, cheers now.