RCSLT - Royal College of Speech and Language Therapists
This is the official podcast of the Royal College of Speech and Language Therapists - RCSLT. We were established on 6 January 1945 to promote the art and science of speech and language therapy – the care for individuals with communication, swallowing, eating and drinking difficulties. We are the professional body for speech and language therapists in the UK; providing leadership and setting professional standards. We facilitate and promote research into the field of speech and language therapy, promote better education and training of speech and language therapists and provide information for our members and the public about speech and language therapy.
IJLCD - Six questions, eight years later: identifying early predictors of language development
Please let us know what you think of this podcast.
In this podcast we chat with Loretta Gasparini about the research she led on finding a robust predictor tool for persistent language disorders. The aim of this research is to identify young children who are likely to have persisting language difficulties, so that we can recruit them into research, build a strong evidence base and ultimately support them to thrive.
The paper is:
Identifying early language predictors: A replication of Gasparini et al. (2023) confirming applicability in a general population. https://onlinelibrary.wiley.com/doi/10.1111/1460-6984.13086
Loretta Gasparini, Daisy A. Shepherd, Jing Wang, Melissa Wake, Angela T. Morgan
This paper was awarded the International Journal of Language and Communication Disorders 2024 Editors' Prize.
GITHUB LINK:
https://github.com/lottiegasp/languagepredictions
DEMONSTRATOR LINK:
https://storage.googleapis.com/rcslt/Index.html
Please be aware that the views expressed are those of the guests and not the RCSLT.
Transcript Name:
ijlcd-identifying-early-language-predictors-a-replication-of-gasparini-et-al-2023-confirming-applicability-in-a-general-population-cohort
Transcript Date:
3 December 2025
Speaker Key:
HOST: JACQUES STRAUSS
LORETTA: LORETTA GASPARINI
MUSIC PLAYS: 0:00:00-0:00:09
HOST: 0:00:09 Welcome to the RCSLT podcast. My name is Jacques Strauss. This is an IJLCD edition of the podcast, in which we talk to authors of papers in the International Journal of Language & Communication Disorders about research that we think will be of interest to the wider SLT community.
We’re joined by Loretta Gasparini to talk about a paper entitled Identifying early language predictors: A replication of Gasparini et al. (2023) confirming applicability in a general population cohort. This paper looks at whether we can spot children who are likely to have lasting language difficulties before those difficulties become obvious. Earlier research had suggested that a small set of parent-reported questions, asked when the child was two to three years old, could help predict their language abilities later on.
I started by asking Loretta to introduce herself.
LORETTA: 0:01:07 I'm Loretta Gasparini, I'm a PhD student. I'm based at the Murdoch Children’s Research Institute and also the University of Melbourne, Australia. I'm pretty close to submitting my PhD thesis and I'm interested in children with language disorders: how we can identify these children, which I’ll talk about in the study I'm discussing today, and how we can support these children to really live up to their full potential.
HOST: 0:01:38 What clinical question were you trying to answer?
LORETTA: 0:01:43 We were trying to answer the question, how can we identify young children who are likelier to have persisting language difficulties or disorder, using a very quick approach that’s suitable for population-wide use.
HOST: 0:01:58 What was known about our ability to make these sorts of predictions?
LORETTA: 0:01:43 A really good summary in answering this question is a 2020 paper by Karla McGregor, called How we fail children with developmental language disorder. Just a little bit of context. Developmental language disorder, or DLD, is what we call it when individuals have everyday difficulties with language that aren’t explained by another condition. We think that about 7% of children have DLD, and then there might be another 2% or 3% who have a language disorder that is explained by another condition.
In this paper, Karla McGregor argues that DLD is relatively unknown in the community and relatively under-researched. If we compare this to a few other conditions, like dyslexia, autism or ADHD, there's some evidence that a lot of people in the general population have at least heard of those conditions, but very few people have actually heard of DLD. And this extends to teachers and other service providers who aren’t specialists in language, who have relatively little knowledge of DLD.
And then if we also look at research on similar neurodevelopmental disorders, relatively little research is done on DLD, given its prevalence and the impact that it can have on people’s lives.
And so McGregor outlines this in her paper and she argues that a consequence of these issues is that children with DLD are really under-served. Many are not identified as having DLD in a timely manner and so they're not given the support and treatment that they really need to help develop their language skills and their communication skills and really live up to their full potential.
McGregor argues that DLD is a hidden condition. It's not easily spotted by parents, teachers and other clinicians who aren’t specialists in language. If we compare this, for example, to a speech sound disorder or a stutter, these conditions might be a little bit easier for just the general person to be able to hear that the person talking just sounds a little bit different from usual. Whereas there aren’t really any such clear cues that you can hear when you're listening to someone with DLD talking.
This makes it harder to identify. And being hard to identify means that it's hard to build an evidence base about how we can identify these young children early in their lives. They do need help, we know this, but it's not obvious enough which children need help. If we can't identify which children at a young age are the ones who really need help, how can we test the effects of these early intervention programmes? That’s really the big challenge here.
And the US Preventive Services Task Force confirms this. In their most recent report, in 2024, they found insufficient evidence to support screening for speech and language disorders in children five years or younger in the general population. And this was the third time in the last 18 years that they had reached this same finding of insufficient evidence. Unfortunately, we’re not really making a lot of progress.
HOST: 0:05:42 I think we’re beginning to have an idea, but can you talk us through what it is your research was designed to address?
LORETTA: 0:05:49 My colleagues and I, we wanted to develop an approach to identifying young children, three years or younger, who were likelier to have persisting language difficulties throughout childhood. Now we had a few key criteria that we really wanted to follow, or achieve, in developing this approach. We wanted this approach to be very fast and simple to administer, to make it easy to potentially administer it across a whole population and to not be leaving out large groups of children, especially children who might be likelier to be underrepresented in services.
Another challenge in developing early identification tools that we wanted to try and overcome is that it's quite challenging to identify true cases of both low and typical language outcome. Many existing tools might go a bit too far one way or the other: they might over-classify many children as having language disorder when actually a lot of them don’t have language disorder or don’t have low language. Or they might go the other way and over-classify lots of children as having typical language, when there might actually be lots of children in that group who are misclassified and really do need some extra help with their language. We really wanted to make sure that our tool has good enough accuracy for classifying both groups.
Also, we wanted to make sure that the tool would predict late childhood language. And this is a little bit of a distinction from a lot of other research that’s out there, which just looks at establishing short instruments that can identify children based on a concurrent measure from a longer instrument of language disorder. But what we outline in the paper is that we’re not actually interested in children who are late to start talking in the preschool years but then go on to develop typical language, maybe in their early school years. These children don’t actually need any extra help, even if they might seem a little bit delayed in the preschool years. This is why we talk in the paper about predicting a late childhood language outcome.
And, finally, key to this particular study that I'm talking about today is that we wanted to make sure that this approach that we develop to early identification could replicate across different samples. Replication is actually, unfortunately, quite rare in our research fields and lots of research fields actually. But it's really important to show that the results that we find in our first research sample is actually going to apply to another sample and then, as an extension to this, hopefully our results will apply to the general population really.
HOST: 0:08:55 When I suggested to Loretta that what we’re talking about is some form of screening, she was, understandably, reluctant to use that word in case we get too far ahead of the evidence. But it's worth drawing out the point here.
If you think about other kinds of screening, let’s say breast cancer, if you design a test to catch as many true cases as possible, you naturally generate more false positives, which leads to extra appointments, further investigations and pressure on services. But if you tighten the threshold to reduce false positives, you inevitably miss some real cases and the consequences of that can be devastating. Sensitivity is about not missing anyone who needs help and specificity is about not worrying people who don’t.
Before we got into the results, Loretta talked me through the design of the study.
LORETTA: 0:09:50 We took data from an existing Australian cohort study, where the individuals, the participants, would be about 22 or 23 years old by now, but the data that we used in this particular study were from just over 1,400 children. They took part in this study from when they were babies until they were teenagers, and some even then into adulthood. When these children were two or three years old, their parents answered various questions about the child’s development: their language development but also, more generally, their communication and their cognitive and behavioural development. Also many questions about the child’s family environment and other environmental factors about how they were growing up. And then, about eight or nine years later, when the children were 11 or 12 years old, they did a language test. And that’s what we’re calling the outcome, the outcome that we’re interested in.
Specifically, because I know we’ve got a lot of speech therapists listening, the outcome was that these children completed the Recalling Sentences subtest of the Clinical Evaluation of Language Fundamentals, or CELF-4. This is a sentence repetition task: the children hear a sentence and they're asked to repeat it.
The Recalling Sentences score that we get from this task is not, itself, a diagnostic tool for language disorder, but it has been found, across various studies, to have very high agreement with global language skills. For example, to go into a bit of detail, it's been found to have over 95% agreement with the CELF core expressive and receptive scores, and these scores can be used as part of a diagnosis of language disorder. So we are using an imperfect measure to classify the study participants into the low language and the typical language groups. It's not a diagnosis of language disorder, but it still gives us a pretty good idea of the children’s late childhood global language skills.
These children who got a low Recalling Sentences score at 11-12 years are the children we want to identify using those measures that were collected at 2-3 years, eight or nine years earlier. To apply this to the real world, the children who have low language throughout childhood are the ones we want to identify early so that we can give them extra support.
I’ll go back to a previous study that we conducted, I’ll call this one study one. In study one we used mathematical methods to narrow down from about 2,000 parent-reported questions to just six to eight questions that were pretty effective at identifying low language at 11 years. Now, in that earlier paper we identified a few predictor sets, but relevant to this replication study, which I’ll call study two, we identified a set of six questions that were asked when children were two or three years old. And then, in both study one and study two, we used mathematical methods to estimate how well these sets of six questions could predict whether children would have low language at 11 or 12 years.
HOST: 0:13:29 We have two distinct populations and we wanted to see whether these questions were a reasonable predictor of possible DLD later on. What were the findings?
LORETTA: 0:13:41 Overall, between these two cohorts, so in both study one and study two, we found consistent findings. We found that these six questions could correctly predict seven or eight out of every ten children who will have low language, and seven or eight out of every ten children who will have typical language, eight or nine years later, at 11 or 12 years of age. Just to use a bit of technical language, to translate into what researchers might be used to reading, that’s about 70-80% sensitivity and specificity across two different samples.
HOST: 0:14:22 Could you tell us more about what these questions were?
LORETTA: 0:14:26 The first one was a question from the MacArthur-Bates Communicative Development Inventory, or the MCDI. This is a question about the child’s grammar skills where the parent is asked to tick the sentence that sounds the most like the way their child is talking at that time. And the two options are, this dolly big or this dolly big and this dolly little. This particular question captures the child’s ability to combine clauses with conjunctions, so using the word ‘and’. But, in the MCDI, there's lots and lots of questions about the child’s grammar and this was one question, particularly, that arose as very predictive of the later language outcome.
I suspect that this particular question is just a good snapshot of the broader grammar production skills that we would expect typically developing children at this age to have.
The next four items were all productive vocabulary items, also from the MCDI. These were whether the parent reports that the child is saying the words ‘circle’, ‘accident’, ‘forget or forgot’ and ‘kangaroo’. I thought I'd just add in here that, in a later study, we also showed that this vocabulary item ‘kangaroo’ can be replaced with the word ‘today’. And we get very similar results. It may be that children outside of Australia aren’t learning the word ‘kangaroo’ at the same age as Australian children and just replacing it with the word ‘today’ still has the same level of accuracy.
These four vocabulary items I think really are just providing a snapshot of a child’s vocabulary that we would expect at that age if they're typically developing or at lower risk of lower language outcomes. They're multisyllabic words, we’ve got mostly nouns but there's a verb in there. A very brief snapshot of children’s vocabulary development by that age.
And then the final, the sixth question is: do you have any concerns about how your child behaves? This question is from the Parents’ Evaluation of Developmental Status, or PEDS, survey. And this might be a global question about the child’s broader development that can tap into wider concerns about their development.
The way I interpret why this might be quite a good predictor might be that, if a child is having difficulties communicating and making themselves understood, that might manifest as what is perceived as problematic or different behaviour and it's really the child trying to make themselves be understood and make themselves be seen.
To summarise, these are six questions that we think would take parents only about one minute to answer about their child, but that have over 70% accuracy for predicting their language outcome about eight or nine years later.
HOST: 0:18:00 That’s really exciting: this research has been replicated across two cohorts, in which six questions were able to predict, with 70-80% sensitivity and specificity, potential language difficulties eight or nine years later. And, because that’s very exciting, let’s talk about the limitations of the research.
LORETTA: 0:18:27 I think there are two major ones. The first one I have already hinted at a little bit. In this replication study, study two, the 11-year language outcome that we used was the Recalling Sentences subtest of the Clinical Evaluation of Language Fundamentals, and, like I said, this isn’t a diagnostic instrument, it's just a single subtest of one. The reason for this was really just logistical. The cohort study didn’t administer the whole CELF, due to time constraints; they were collecting various measures about the child’s development and so that was what we could work with. And, like I said, the Recalling Sentences subtest has excellent but not perfect agreement with the core language score, and so we expect that some of these 11-12-year-olds would have been misclassified, some as having low language and some as having typical language.
But another detail that I’ll add is that the original study, study one, the one that we now replicated, did use the CELF core language score, which is a more robust measure of a child’s global language skills. And, like I said, the results were consistent between study one and study two, so that puts it into a bit more context.
And then I would say another limitation of the study is that there were relatively few children, in both cohorts across study one and study two, who speak languages other than English at home. It's not clear how well this predictor set would work with multilingual children; in the Australian context, this might include Aboriginal and Torres Strait Islander families. The cohorts were Australian, so we don’t know yet whether these predictor sets would be as accurate if they were translated into other languages and collected for children learning other languages. I think future work would definitely focus on subgroups of interest to test this further.
HOST: 0:20:44 Presumably, further replication studies would be useful, studies of bilingual children, etc., but what other research would you and your colleagues like to see?
LORETTA: 0:20:54 Like you said, I think there's always room for further validation, testing the accuracy in other contexts and adding to evidence about how well this predictor set works. But what I'm really excited about is this predictor set potentially being used to help researchers recruit children into early intervention trials. As I've said before, it's hard to really test early interventions because so many of the children who are recruited are ones we don’t actually know will end up having typical language or low language. This study provides us with a really brief set of questions to ask parents that gives a snapshot of their child’s development at the moment. This can help researchers decide, all right, let’s recruit this child into this study because they do seem to have a higher risk of a language disorder and, therefore, we think they might benefit more from intervention.
To this end, we’ve developed a workflow online, on a website called GitHub, where researchers can collect these six questions from parents and generate predictions to decide, for example, who to recruit into their research.
HOST: 0:22:16 Given how quick and simple this set of questions is, it's worth thinking about how your service, trust or school could start gathering this information as part of everyday practice. What would your key takeaways be for SLTs?
LORETTA: 0:22:33 The take-home message I'd really like listeners to take away is that we’ve established six questions that we can ask parents when their child is two or three years old, and we found these can correctly predict seven or eight out of every ten children with low language, and seven or eight out of every ten children with typical language, eight or nine years later. And we’ve found that these results replicate across two different samples.
This set of six questions is ready to be used by researchers who want to recruit young children into early intervention trials, where we still need more evidence on how to support these children. And more evidence on how to identify and how to support these young children will help them live up to their full potential and that’s really what we want, that’s really why we’re doing this. [FADES OUT]
MUSIC PLAYS: 0:23:25-0:23:35
HOST: 0:23:34 We’ve now done a number of podcasts about DLD, including how DLD interventions can be embedded into existing school curricula, the views of children with DLD and interviews with adults who have DLD. If you have an interest in this field there are lots of papers in the IJLCD and podcast episodes that would be worth exploring.
In the show notes you will find a link provided by the authors to GitHub, which is a set of resources for researchers. It shows the exact questions, the exact data format, and the code they used to analyse those answers in their studies. It's there so that different research teams can collect the same information consistently. It isn’t a screening tool and it isn’t meant for clinical use.
We’ve also included a link to a demonstrator. It's an interactive way to show what those questions look like in practice. It helps people understand the idea behind the research, it doesn’t calculate risk and it doesn’t classify children and it isn’t linked to the research data. It's just a teaching tool that we’re hosting to help SLTs and educators see the kinds of questions the researchers used.
A very big thank you to Loretta for her time. As always, please do share this podcast to help further research in these important areas. Until next time, keep well.
MUSIC PLAYS: 0:24:53
END OF TRANSCRIPT: 0:25:12