
RCSLT - Royal College of Speech and Language Therapists
This is the official podcast of the Royal College of Speech and Language Therapists - RCSLT. We were established on 6 January 1945 to promote the art and science of speech and language therapy – the care for individuals with communication, swallowing, eating and drinking difficulties. We are the professional body for speech and language therapists in the UK, providing leadership and setting professional standards. We facilitate and promote research into the field of speech and language therapy, promote better education and training of speech and language therapists, and provide information for our members and the public about speech and language therapy.
What is the role of Artificial Intelligence in Augmentative and Alternative Communication (AAC)?
What is the role of AI in AAC (Augmentative and Alternative Communication)?
In this episode, Professor Annalu Waller, Professor of Human Communication Technologies at the University of Dundee, and Alan McGregor, former UK Paralympic team swimmer and honorary researcher at the University of Dundee, take us through what AI can do now and the developments they'd like to see.
We cover:
- What is AI's role in AAC?
- How can we use generative AI to expand on communication?
- What are some of the ethical considerations?
- What Annalu and Alan would like to see in the future.
Interviewees:
Professor Annalu Waller, Professor of Human Communication Technologies at the University of Dundee and lead for the Augmentative and Alternative Communication Research Group at the university.
Alan McGregor, former UK Paralympic team swimmer, part of the Straight Talking Group, and honorary researcher at the University of Dundee.
During the conversation Alan is supported by his assistant, Cindy Macfarlane.
Resources:
Here are some links which relate to AI and AAC:
- https://discovery.dundee.ac.uk/en/publications/use-of-artificial-intelligence-ai-in-augmentative-and-alternative
- https://discovery.dundee.ac.uk/en/publications/telling-tales-unlocking-the-potential-of-aac-technologies
- https://discovery.dundee.ac.uk/en/publications/blending-human-and-artificial-intelligence-to-support-autistic-ch
- https://discovery.dundee.ac.uk/en/publications/personal-storytelling-using-natural-language-generation-for-child
- https://discovery.dundee.ac.uk/en/publications/evaluating-the-standup-pun-generating-software-with-children-with
Groups:
- Dundee Accessibility and Assistive Technology Research Group: https://aac.dundee.ac.uk/
- Straight Talking Group: https://aac.dundee.ac.uk/stg/
Please be aware that the views expressed are those of the guests and not the RCSLT.
Please do take a few moments to respond to our podcast survey: uk.surveymonkey.com/r/LG5HC3R
Transcript Name:
artificial-intelligence-and-aac
Transcript Date:
18 November 2024
Speaker Key:
HOST: JACQUES STRAUSS
ANNALU: ANNALU WALLER
ALAN: ALAN McGREGOR
CINDY: CINDY MACFARLANE
MUSIC PLAYS: 0:00:00-0:00:05
HOST: 0:00:05 Welcome to the podcast of the Royal College of Speech and Language Therapists. This is the first in a series of episodes in which we are going to take a closer look at Artificial Intelligence and the role that it can play in speech and language therapy, as well as some of the potential problems.
In this episode, we are going to take a closer look at the role of AI in augmentative and alternative communication and how this field has long played a pioneering role in the use of novel and cutting-edge technologies.
We were joined by Annalu Waller, Professor of Human Communication Technologies at the University of Dundee, and Alan McGregor, Honorary Research Assistant, also of Dundee University. Alan was assisted by Cindy Macfarlane, whom we will hear from later.
I started by asking our guests to introduce themselves.
ALAN: 0:00:59 My name is Alan McGregor. I am an Honorary Research Assistant at Dundee University. I help out at the university to create new software for non-speaking people. I also give talks to students about being a non-speaking person and the troubles we face in everyday life.
For a person who cannot talk, there are a few different AAC options. The device I am using here is an iPad with an application called Snap, which has a whole range of topics and menus which have programmable buttons and a keyboard page where I can type and use predictive text and have it spoken out loud.
I sometimes also use a word board where I can point to words to form a sentence. It can work well for one-to-one communication, but doesn’t work so well in a group setting.
Sometimes, I use a little bit of sign language and gesturing, and I can try to vocalise a few words here and there.
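[For listeners curious about the "predictive text" Alan mentions, here is a minimal sketch of how a frequency-based next-word predictor can work. It is purely illustrative: it is not the implementation used in Snap or any other commercial AAC product, and the sample history text and the suggest() function are assumptions made for the example.]

```python
from collections import Counter, defaultdict

def build_bigram_model(history: str):
    """Count which words have followed each word in the user's past messages."""
    words = history.lower().split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def suggest(model, current_word: str, k: int = 3):
    """Return up to k words most often typed after current_word."""
    return [w for w, _ in model[current_word.lower()].most_common(k)]

# Hypothetical history of things the user has typed before.
history = "I would like a cup of tea please. I would like to go swimming today."
model = build_bigram_model(history)
print(suggest(model, "would"))   # e.g. ['like']
print(suggest(model, "like"))    # e.g. ['a', 'to']
```

[Real devices use much richer language models trained on far more data, but the basic idea is the same: rank likely next words from past usage so the user can select rather than type.]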
ANNALU: 0:02:21 Hi, my name is Annalu Waller and I’m a Professor of Human Communication Technologies at the University of Dundee, where I direct the AAC research group and have access to loads of wonderful people who use AAC, who are our expert user group and who direct what we do and what we create.
HOST: 0:02:59 I wonder if we can start off with a very broad question to both of you: What do you see as the potential role of AI in this field?
ALAN: 0:03:11 My idea for AAC software using AI would be a program that could listen to the conversation I am currently involved in, pick up on environmental clues such as GPS location, and use what I have said about the current topic in the past to form a few brief sentences that I can select from. [AI 0:03:39] would also know who I was speaking to and look at the last messages I have sent to that person.
That way, it could know what news stories I have read online recently and be able to condense them if that was the current topic of conversation. This could all add up to an AAC user being able to join in a conversation much more quickly than we are able to at the moment. This would help us to say what we are trying to say much more easily.
HOST: 0:04:13 At this point, I should say that Alan had to spend some time preparing some answers in advance, and that the technology doesn’t allow Alan to respond to my questions this quickly in real-time. In this podcast, we have shortened response times for the sake of brevity.
ALAN: 0:04:30 With my communication device, I can speak at around 10 words per minute. Someone who can talk without difficulty can manage around 150 words per minute.
HOST: 0:04:45 Annalu has been in this field for a long time, so I was keen to know what she made of the now widespread interest in AI.
ANNALU: 0:04:56 In my area, AI is not new, and this is something I keep on reminding people of. Within AAC particularly, we’ve had AI support since the late 80s. And the focus that the research at Dundee has taken is exactly what Alan was talking about – this desire, this need to have communication that is quick, effortless, and gives you what you want to say and not what somebody else thinks you want to say.
In terms of AI, we can speed up that communication, and that’s what we’ve been doing with word prediction, and with sentence and story prediction as well. But that, in fact, needs an interface which offers users access to those bits of conversation. And I think it’s that interface that poses the most difficult challenge for us.
So, we can get the words, we can get the sentences, we can know where we are, and a lot of AAC devices are beginning to use geolocation to inform what the user might want to say next. So, if I’m in a coffee bar, I might want to order a latte. But that’s easy. That’s predictable. Even you can predict that if I were in a coffee bar I would want to order a drink.
When it comes to the more interesting bits of conversation, we have a problem, because the AI does not necessarily know where we were, who we were talking to, and what we were talking about.
So, a lot of my work has been looking at trying to sense that data automatically. Say, in a school, I know where the child is, I know who they are interacting with, I know what objects they are playing with, and I can then generate sentences to help a child describe what they did at school today.
So, we are then generating text for AAC users. And this is where generative AI comes in. But whose voice is that? Is it my voice? Is it Alan’s voice, or is it just a conglomeration of the common denominator of so much data?
Basically, AI can give us two things. It can speed up our current conversation, but it can also expand and give us more information in text form, if we want.
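[As a rough illustration of the kind of context-aware suggestion Alan and Annalu describe – not the Dundee group's actual system – the hypothetical sketch below ranks stored phrases by how well they match the current situation: where the user is, who they are talking to, and the current topic. All of the phrase data, field names and scoring are invented for the example.]

```python
from dataclasses import dataclass, field

@dataclass
class Phrase:
    text: str
    locations: set = field(default_factory=set)   # places where this phrase has been used
    partners: set = field(default_factory=set)    # people it has been said to
    topics: set = field(default_factory=set)      # topics it relates to

def rank_phrases(phrases, location, partner, topic, k=3):
    """Score each stored phrase against the current context and return the top k."""
    def score(p: Phrase) -> int:
        return ((location in p.locations)   # sensed via GPS / geolocation
                + (partner in p.partners)   # known conversation partner
                + (topic in p.topics))      # current topic of conversation
    return sorted(phrases, key=score, reverse=True)[:k]

# Hypothetical stored phrases built up from past conversations.
phrases = [
    Phrase("I'd like a latte, please.", {"coffee bar"}, {"barista"}, {"ordering"}),
    Phrase("Training went well this morning.", {"pool", "home"}, {"Cindy"}, {"swimming"}),
    Phrase("Did you see the news this morning?", {"home"}, {"Cindy"}, {"news"}),
]

# The device senses we are at home, talking to Cindy, about swimming.
for p in rank_phrases(phrases, location="home", partner="Cindy", topic="swimming"):
    print(p.text)
```

[In a real system the context would be sensed automatically, and candidate sentences might be generated rather than retrieved, which is where the interface challenge Annalu describes comes in: presenting those candidates quickly enough to keep up with the conversation.]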
HOST: 0:09:14 So, there is the problem of the interface, which is a real challenge. We also want to take the technology beyond the purely functional – not just ordering coffee, but having a meaningful conversation. This introduces a new problem: whose voice is it? Because generative AI is merely the aggregate of available information. To what extent does it reflect the user, their history, and their thoughts?
ALAN: 0:09:40 Software [inaudible 0:09:38] know my life.
CINDY: 0:09:49 The software doesn’t know about your life.
ALAN: [Inaudible 0:09:52].
CINDY: 0:09:53 But grows to know. It gets to know you the more you [inaudible 0:09:58].
ANNALU: 0:09:59 I think, Alan, what we talked about previously was the fact that your support workers know you so well that they can immediately expand or interpret your communication to people who don’t know you that well. The holy grail is, how can we get a computer to actually have all of that knowledge about you, and how can we then present that information in a way that you can actually understand what’s there and be able to use it within interactive conversations?
ALAN: 0:11:05 I know it hard, but I think we need to try. I think software needs six months to know my life.
ANNALU: 0:11:21 What frustrates me as a researcher is that we’ve demonstrated how this technology can work, but to get it into commercial systems requires developers to change the paradigm, the way in which they provide that communication for end users.
For example, the idea of storing what you’ve said already is being done by different developers, but think of the effort it takes for somebody to actually think, I’ve said this before; where can I go to find it? And then I’ve got to look through all my past history to locate what I want, and then it isn’t quite the way I want to say it in that conversation. So, in the time it takes me to do that, the conversation has gone, has run away from you.
What we are demonstrating at the moment in our latest prototype is trying to get away from having the user consciously think about going to different places [in their programme 0:13:11] – I want those words or those pictures or those icons to come up where you are looking.
Some of my work has been talking about, do we actually want a visual interface anyway? Can we get away without an interface in the future? Because people need to look at each other to communicate and interact, not through a machine.
HOST: 0:13:51 Alan, I’m curious about whether you’re both excited by the possibilities but also frustrated, given the work that Annalu has described over all of these years. Like, why have we not made more progress?
ALAN: 0:14:06 I now same quick speed talk past 30 years?
CINDY: 0:14:15 Yeah, Alan says he’s a little upset that it’s the same speed now as it was… as it has been for the last 30 years.
ANNALU: 0:14:26 Fundamentally, nothing has changed. I think we have the most amazing developers out there, the most amazing companies who are really invested in the good of the user, which is so different to other companies. They are wonderful to work with, but they are very small. So, their ability to innovate and to take the research out from the universities or other research labs… And the risk of changing the way AAC is delivered is huge. And we don’t have the level of investment that other major tech companies have. We are getting more interest from the big tech companies like Microsoft and Google and Apple who really see something in the arena of accessibility.
For example, one of the projects that companies are collaborating on, which is unheard of, is voice recognition for speech like mine or Alan’s. So, for this atypical speech pattern, they are now working to build up a dataset. But because we are such a small minority, it’s really difficult to build up those big datasets for projects like that.
Another problem is people keep on asking me for AAC datasets, and I say, why? Because we are not trying to replicate what people who can’t speak are already doing. We are trying to replicate the level of typical interaction that people engage in without thinking. We know what to do, we just don’t have the infrastructure or the funding to have a concerted effort of pulling all that we know together into a system and an interface that might work.
There’s a difference between getting funding for research and getting funding for implementation. You will see a lot of the research from Dundee in commercial systems, but they’re not taking the AI with it.
HOST: 0:17:58 One of the other issues that Annalu and Alan raised is that because this is a small and specialised research community, there is often a loss of institutional knowledge. The same ideas will resurface, and work that has been done previously may be forgotten. So, if we are to see significant advancement in AAC, there are technical challenges such as the interface, but also more systemic problems, such as the size of the medical device sector that works in the area, funding for implementation of research, and knowledge retention.
ANNALU: 0:18:32 The other problem is that very few developers have the skills to involve disabled users in their design process. So, instead of having people like you, Alan, as part of a research group, they employ what they call ‘proxy users’ – people who try and be like us. And nobody can move their hands quite like I do.
HOST: 0:19:12 I think you mentioned before, when we spoke, the complex issue of standardisation.
ANNALU: 0:19:18 The problem is every time you start a new system, you have to start from scratch, so there’s no standardised way of taking the data and the history and your settings from one system to another. And colleagues and I think that the lack of standardisation is also partly to blame for the stagnation in the development of new ways to work.
HOST: 0:20:05 I then asked Annalu and Alan an offhand question about medical device companies. It took us in an unexpected direction and made me consider the question of AAC in a completely new way – one that I would never have considered before.
ANNALU: 0:20:22 I have a real issue with looking at this [inaudible 0:20:28] medical model. I think AAC technology, assistive technology in the wider sense, needs to be a social tool. It’s not there because someone is disabled. It needs to be accessible by everybody, even if they are disabled.
So, we were talking about design and why this is a small area for the big companies. Actually, if you make everything more accessible and you provide access to AAC tools, you benefit everyone. Things like word prediction: we all use word prediction now on our phones because we are disabled by the situation. So, it means that what was an assistive technology in the 80s and 90s is now a mainstream solution for everybody.
And this is why I think we’ve managed to get the big giants on board, because they see this in a wider context. But they are still having to convince their management structure that this is an area that needs investment, so that not only do we provide support for people with visual and hearing and physical disabilities, we’re also looking now at how we support people who are not as literate as they might be, or have intellectual difficulties that mean big pieces of information are difficult to understand.
Again, we are working within the bigger arena to try and make sure that, with AI, we don’t leave disabled people behind – especially people with communication and physical and some level of intellectual impairment as well, because we’re all unique. So, I need to support those people who have learning disabilities as much as I support someone who uses AAC or a screen reader.
As a computer scientist and assistive technologist, I have got fingers in every pie conceivable, because everybody needs assistive technology to work wherever they are.
I have always been of the mind that AAC is bigger than what we think it is. So, Alan and I teach all second year medical students every year. And part of their teaching is to identify that the skills and the resources that AAC gives us helps us to communicate across every sphere. People with a mental health issue who might be too anxious to talk, we can use it there. An immigrant without English – we need to use the same strategy. Someone who has been a victim of domestic abuse and who can’t talk about it needs the skills we can give and provide. We need to empower others to take on the AAC.
ALAN: 0:25:21 I think need all people learn about can’t talk.
ANNALU: 0:25:30 And that makes us human beings with this ability to interact and share who we are with each other.
HOST: 0:25:46 Before we end the podcast, there is something important I should reveal that listeners may or may not have noticed. In this podcast, every word I said, including this, was created with a text-to-speech generator using a clone of my voice. It was created in a programme called ElevenLabs, which you may be familiar with, and it is a tool we will discuss in a future episode, though I suspect both Alan and Annalu consider voice cloning fairly low-hanging fruit, as one might say.
As always, if you want further information, see the show notes for links.
A very big thank you to Alan and Annalu for their time today. We hope you will be able to join us for the next episode about Artificial Intelligence.
MUSIC PLAYS: 0:26:27
END OF TRANSCRIPT: 0:26:42