The Digital Transformation Playbook
Kieran Gilmurray is a globally recognised authority on Artificial Intelligence, intelligent automation, data analytics, agentic AI, leadership development and digital transformation.
He has authored four influential books and hundreds of articles that have shaped industry perspectives on digital transformation, data analytics, intelligent automation, agentic AI, leadership and artificial intelligence.
𝗪𝗵𝗮𝘁 does Kieran do❓
When Kieran is not chairing international conferences, serving as a fractional CTO or Chief AI Officer, he is delivering AI, leadership, and strategy masterclasses to governments and industry leaders.
His team helps global businesses drive AI, agentic AI, digital transformation, leadership and innovation programs that deliver tangible business results.
🏆 𝐀𝐰𝐚𝐫𝐝𝐬:
🔹Top 25 Thought Leader Generative AI 2025
🔹Top 25 Thought Leader Companies on Generative AI 2025
🔹Top 50 Global Thought Leaders and Influencers on Agentic AI 2025
🔹Top 100 Thought Leader Agentic AI 2025
🔹Top 100 Thought Leader Legal AI 2025
🔹Team of the Year at the UK IT Industry Awards
🔹Top 50 Global Thought Leaders and Influencers on Generative AI 2024
🔹Top 50 Global Thought Leaders and Influencers on Manufacturing 2024
🔹Best LinkedIn Influencers Artificial Intelligence and Marketing 2024
🔹Seven-time LinkedIn Top Voice.
🔹Top 14 people to follow in data in 2023.
🔹World's Top 200 Business and Technology Innovators.
🔹Top 50 Intelligent Automation Influencers.
🔹Top 50 Brand Ambassadors.
🔹Global Intelligent Automation Award Winner.
🔹Top 20 Data Pros you NEED to follow.
𝗖𝗼𝗻𝘁𝗮𝗰𝘁 Kieran's team to get business results, not excuses.
☎️ https://calendly.com/kierangilmurray/30min
✉️ kieran@gilmurray.co.uk
🌍 www.KieranGilmurray.com
📘 Kieran Gilmurray | LinkedIn
The ChatGPT Education Test
The educational landscape is rapidly evolving with AI tools, but what does the evidence actually tell us about ChatGPT's effectiveness in learning environments? Moving beyond the hype and confusion, we dive deep into a groundbreaking meta-analysis that examined 51 different research studies conducted between 2022 and 2025.
Listen in as Google NotebookLM's voice-generated podcast hosts explain.
TLDR:
- ChatGPT shows large positive impact on learning performance (effect size 0.867)
- Greatest impact on critical thinking occurs in STEM fields when ChatGPT acts as an intelligent tutor
- Effectiveness depends on thoughtful integration by educators
- Works best when designed to support deeper learning rather than providing quick answers
The results are striking. Students using ChatGPT showed significant improvements in learning performance with an effect size of 0.867 – considered large in educational research. The impact on learning perception and higher-order thinking skills was moderate but still consistently positive. When we examine where ChatGPT shines brightest, the patterns become clear: skills-based courses saw the strongest positive effects, while problem-based learning environments (effect size 1.113) created the ideal conditions for ChatGPT to enhance student performance.
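For readers unfamiliar with effect sizes, here is a minimal sketch of how a single study's effect size could be computed. It assumes the "G" quoted in this episode is Hedges' g, the standardized mean difference with a small-sample correction; the episode does not spell out the formula. The function name and the test scores are hypothetical and are not data from the meta-analysis.

```python
import math
from statistics import mean, variance


def hedges_g(treatment, control):
    """Standardized mean difference with a small-sample correction (Hedges' g).

    Assumption: this is the 'G' quoted in the episode; the episode
    does not confirm the exact formula used by the researchers.
    """
    n1, n2 = len(treatment), len(control)
    # Pooled standard deviation built from the two sample variances.
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control))
        / (n1 + n2 - 2)
    )
    # Cohen's d: difference in means expressed in pooled-SD units.
    d = (mean(treatment) - mean(control)) / pooled_sd
    # Approximate correction that shrinks d slightly for small samples.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction


# Hypothetical test scores, invented purely for illustration.
with_chatgpt = [78, 85, 90, 72, 88]
without_chatgpt = [70, 75, 80, 68, 77]
print(round(hedges_g(with_chatgpt, without_chatgpt), 2))  # about 1.23
```

A meta-analysis then combines many such per-study values into a weighted average, which is how a single headline figure like 0.867 comes about.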
Timing matters too. Performance benefits peak during 4-8 weeks of use, suggesting a sweet spot between the initial learning curve and potential over-reliance. Interestingly, positive attitudes toward learning continue to grow the longer students use ChatGPT. For developing critical thinking skills, STEM fields benefit most, especially when the AI functions as an intelligent tutor rather than a passive tool.
These findings have profound implications for how we think about AI integration in education. ChatGPT demonstrates effectiveness more than double that of traditional AI assessment tools previously studied, pointing to its versatility as both a learning assistant and potential tutoring companion. However, thoughtful implementation remains essential – the technology works best when deliberately structured to support deeper learning rather than simply providing quick answers.
How can educators and students leverage these insights? Consider strategic implementation in problem-solving contexts, especially for skills development and STEM subjects. Use ChatGPT as an interactive tutor rather than a passive reference tool when possible, and be mindful of optimal duration periods. Beyond the technology itself, these findings invite us to reconsider what uniquely human elements of education become even more crucial as AI tools continue to evolve.
Listening to this episode and want to try implementing these evidence-based strategies yourself? We'd love to hear about your experiences using AI in learning environments – share your thoughts and join the conversation!
Link to research: The effect of ChatGPT on students’ learning performance, learning per
𝗖𝗼𝗻𝘁𝗮𝗰𝘁 my team and I to get business results, not excuses.
☎️ https://calendly.com/kierangilmurray/results-not-excuses
✉️ kieran@gilmurray.co.uk
🌍 www.KieranGilmurray.com
📘 Kieran Gilmurray | LinkedIn
🦉 X / Twitter: https://twitter.com/KieranGilmurray
📽 YouTube: https://www.youtube.com/@KieranGilmurray
📕 Want to learn more about agentic AI? Then read my new book on Agentic AI and the Future of Work: https://tinyurl.com/MyBooksOnAmazonUK
Introduction to ChatGPT's Educational Impact
Speaker 1Welcome to the Deep Dive. Today we're really getting into something fascinating. How well is ChatGPT actually performing in education?
Speaker 2Right. It's a huge question and we've got some really solid data to dig into for you today, based on a new meta-analysis.
Speaker 1A meta-analysis, so that's like a study of other studies, right? Pulling lots of research together.
Speaker 2Exactly. This one compiled results from, get this, 51 different research projects, all done between November 2022 and February 2025. So pretty current stuff.
Speaker 1Wow, 51 studies. That gives us a much bigger picture than just one experiment.
Speaker 2For sure, and it means we're moving beyond just anecdotes and opinions. We're looking at evidence.
Speaker 1OK. So what's our mission with this deep dive? What are we trying to figure out from this big analysis?
Speaker 2Well, we want to unpack what it tells us about ChatGPT's actual impact on students. We're looking at three main things. First, their learning performance, you know, how well they actually do.
Speaker 1Grades and test scores, that sort of thing.
Speaker 2Yeah, and then their learning perception, basically how they feel about the learning process when using it.
Speaker 1Okay, performance and perception. What's the third?
Speaker 2And this is a big one: higher-order thinking. Is it actually helping students develop critical thinking, problem solving, that kind of deeper reasoning?
Key Findings from 51 Studies
Speaker 1Right, because that's a major concern you hear, so let's dive in. You hear so many conflicting things, like some people think it's revolutionary, others are worried. What did these 51 studies all combined actually find overall?
Speaker 2Okay, so the headline finding, looking across all that research, is pretty striking. The meta-analysis found, well, a large positive impact overall on learning performance.
Speaker 1Large positive. Can you put a number on that?
Speaker 2Yeah, they used something called an effect size, a measure of impact strength. It came out as G equals 0.867. In educational research that's considered a large effect, pretty significant.
Speaker 10.867. Okay, so generally, using ChatGPT seems linked to students doing noticeably better. What about the other areas, perception and thinking skills?
Speaker 2There was an impact there too, but more moderate. For enhancing learning perception, how students feel, the effect size was 0.456.
Speaker 1Moderately positive. Okay.
Speaker 2And, interestingly, almost identical for fostering higher order thinking skills. G equals 0.457. Also moderately positive.
Speaker 1So a big boost to performance and a decent, noticeable bump for perception and higher level thinking. That's already a huge takeaway.
Speaker 2It is. It suggests ChatGPT isn't just hype. There's substance to its potential in education, at least based on this data.
Speaker 1But I imagine it's not always that straightforward. Like, does it work equally well everywhere, for every subject, every student?
Speaker 2Ah, exactly. That's where it gets more nuanced. The overall positive trend is clear, but the analysis also looked really closely at when and how it's most effective. They looked for these moderating factors.
Speaker 1Moderating factors, yeah, things that change how strong the effect is.
Speaker 2Precisely. Things that influence the impact. Let's start with learning performance again. What makes ChatGPT more or less effective for improving grades and scores?
Speaker 1Okay, yeah, this is really practical for educators and students. What did they find?
What Impacts Learning Performance
Speaker 2Well, one really big factor was the type of course students were taking. The differences were statistically significant. The strongest positive effect, a really robust G of 0.874, was in courses focused on skills and competencies development.
Speaker 1Skills and competencies.
Speaker 2Yeah.
Speaker 1Like learning a specific software or maybe technical writing or lab techniques, that kind of thing.
Speaker 2Exactly. Things where there are often clear steps, well-defined tasks.
Speaker 1And why do you think it works so well there?
Speaker 2The thinking is that ChatGPT is great at providing, like, immediate, targeted feedback for those kinds of tasks. You're learning code, it can debug. Practicing a formula, it can check your work instantly. That rapid feedback loop is really powerful for skill building.
Speaker 1That makes a lot of sense, less waiting for a human teacher to grade something specific. What about other subjects?
Speaker 2It was still effective in STEM fields, science, tech, engineering, math, and also in language learning and academic writing. The effect sizes were decent: 0.311 for STEM and 0.531 for language and writing.
Speaker 1So still positive, but not quite as impactful as in those direct skills courses.
Speaker 2Right. It suggests the benefit might be a bit more pronounced when the learning goal is a very specific practical skill.
Speaker 1Okay, so course type matters. What else influences performance?
Speaker 2The learning model, how the course itself is structured, also showed really significant differences. This was quite striking. The biggest effect by far was in problem-based learning. Huge effect size, G equals 1.113.
Speaker 1Wow, over 1.0. Problem-based learning, where you learn by tackling complex problems?
Speaker 2Yeah, exactly. Students work on messy, often real-world-style problems.
Speaker 1And ChatGPT helps there how? By giving answers?
Speaker 2Not necessarily giving the final answer, but maybe helping students break down the problem, suggesting different angles to consider, providing background information quickly acting as a sounding board for ideas. It seems to really support that kind of active problem-solving process.
Speaker 1Okay, so it's a good partner for tackling tough challenges. Where did it have the least impact then?
Speaker 2Interestingly, the weakest effect on performance was found in project-based learning. The effect size there was much smaller, only 0.239.
Speaker 1Ah, project-based. That often involves longer-term, maybe more creative or real-world application projects. Why wouldn't ChatGPT help as much there? You'd think it'd be useful for research or brainstorming.
Speaker 2That's a great point, and it might be useful for parts of the project. But the researchers suggest that maybe project-based learning relies more heavily on the whole integrated process: planning, execution, collaboration, presentation, maybe things beyond just the discrete problem-solving steps where ChatGPT excels.
Speaker 1So the overall outcome of the project might depend on factors ChatGPT doesn't influence as much.
Speaker 2That could be it. It's less about just finding information or solving small bits and more about the bigger picture, the synthesis, maybe even teamwork, which isn't ChatGPT's forte.
Speaker 1Gotcha. Were other learning models looked at?
Speaker 2Yes, and there were still positive effects in others, like personalized learning, contextual learning, reflective learning (that one was quite high too, actually, 0.866), and mixed models. So it has broad applicability, just varying levels of boost depending on the approach.
Speaker 1Okay, course type learning model. What about time? Does how long you use ChatGPT make a difference to performance?
Speaker 2It does, yes. Duration was another significant factor.
Speaker 1What was the ideal time?
Speaker 2The analysis found the largest effect on learning performance when the duration of use was between four and eight weeks. The effect size there was really high. G equals 0.999.
Speaker 1Four to eight weeks, so like a good chunk of a semester or a focused module. That feels like a sweet spot.
Speaker 2It seems to be. Enough time to learn how to use it effectively, integrate it into your workflow, but maybe not so long that other factors come into play.
Speaker 1Like what? What happened with shorter or longer use?
Speaker 2Well, for durations of a week or less, the effect was much smaller, only 0.332. That suggests maybe there's a learning curve. You need time to get the hang of it.
Speaker 1Makes sense.
Speaker 2And, interestingly, for durations longer than eight weeks, the effect size actually dipped down to 0.531. Still positive, but less than that four-to-eight-week peak.
Speaker 1Any ideas why?
Speaker 2Yeah.
Speaker 1Maybe over-reliance, because it becomes a crutch?
Speaker 2That's definitely a potential explanation the researchers floated. Maybe engagement drops off, or students rely on it too much instead of internalizing the concepts themselves. It needs more research, but it suggests just using it forever isn't necessarily the optimal path for performance gains.
Speaker 1Fascinating, so thoughtful integration over a defined, substantial period seems key for performance. Now what about things like grade level, or whether ChatGPT was used as, say, a tutor versus just a tool? Did those matter for performance?
Speaker 2Surprisingly no. The analysis didn't find significant differences based on grade level, the specific role ChatGPT played or the general area of application when it came to learning performance.
Speaker 1Really? So high school versus university, tutor versus tool, didn't fundamentally change the performance boost?
Speaker 2Apparently not, according to this meta-analysis. The suggestion is that its core utility for helping students learn material and complete tasks might be broad enough to overcome those differences, at least for performance outcomes.
Speaker 1Okay, that's really interesting. Let's switch gears then to learning perception, how students felt about learning. What influenced that?
Duration Effects and Learning Perception
Speaker 2Right. So for perception it was actually simpler. Only one factor showed a significant moderating effect, and that was duration again, how long they used it.
Speaker 1Okay, and what was the trend there? Same as performance, with a peak?
Speaker 2No, actually quite different. For learning perception, the positive feeling increased the longer students used ChatGPT. The effect rose steadily with time.
Speaker 1So the longer they used it, the better they felt about learning?
Speaker 2Exactly. The largest effect size, a quite strong G of 1.054, was seen for usage durations of more than eight weeks.
Speaker 1Wow. So, unlike performance, which maybe peaked earlier, positive feelings kept growing. Why might that be?
Speaker 2It could be that sustained use leads to more familiarity, more confidence in using the tool and perhaps experiencing those performance benefits consistently over time reinforces a positive attitude. You feel like you have ongoing support.
Speaker 1That makes sense. You get used to it. It keeps helping you succeed, so you feel better about the whole process.
Speaker 2That seems to be the implication. Shorter durations like under a week or one to four weeks showed much smaller positive effects on perception. It takes time to build that positive feeling.
Speaker 1And the other factors: course type, learning model, role, grade level. Did they impact perception significantly?
Speaker 2Nope. For perception it really seemed to be about the length of exposure and use. Consistent longer-term use seems to foster that positive attitude towards learning with ChatGPT.
Higher-Order Thinking Development
Speaker 1All right, performance and perception covered. Now let's get to the really tricky one: higher-order thinking, critical analysis, complex reasoning. What shaped ChatGPT's impact there?
Speaker 2Okay, so remember, the overall impact here was moderately positive, G equals 0.457. But again, moderators matter. The type of course showed significant differences.
Speaker 1And where was it most helpful for developing these thinking skills?
Speaker 2STEM and related courses: science, technology, engineering, math. That's where they found the largest positive effect on higher-order thinking, with an effect size of 0.737.
Speaker 1STEM again. Why do you think it helps more with critical thinking in those fields specifically?
Speaker 2Well, STEM fields often involve complex problem solving, analyzing data, designing solutions, tasks that inherently require higher-order thinking. The researchers suggest ChatGPT might be particularly good at supporting the reasoning processes needed for that kind of work, maybe helping students explore complex concepts or evaluate different approaches within those domains.
Speaker 1So it aligns well with the type of thinking needed in STEM.
Speaker 2That seems plausible. The effect was smaller in language learning and academic writing, and also in skills and competencies development courses. Still positive, but less pronounced for higher-order thinking specifically in those areas.
Speaker 1Maybe because those courses sometimes focus more on, say, mastering grammar rules or specific writing formats, rather than purely analytical reasoning.
Speaker 2Could be. It suggests that while it helps with the tasks in those courses, it might not be pushing the deeper analytical skills quite as much as it does in STEM problem-solving contexts.
Speaker 1Interesting. What else influenced higher order thinking?
Speaker 2The role ChatGPT played was also significant.
Speaker 1Okay, tutor or tool?
Speaker 2It was most effective in fostering higher order thinking when it acted as an intelligent tutor. The effect size there was really substantial. G equals 0.945.
Speaker 1An intelligent tutor, so providing more personalized feedback, maybe asking guiding questions, adapting to the student's level.
Speaker 2Exactly. That kind of tailored, interactive guidance seems much more effective at pushing students to think more deeply, to analyze, reflect and grapple with complexity, compared to just using it as a more passive tool.
Speaker 1That makes intuitive sense, a conversation, even with AI, that prompts you to think harder is better than just looking something up.
Speaker 2Precisely. When used as just an intelligent learning tool, the effect on higher-order thinking was smaller, G equals 0.428. Still there, but much less impactful than the tutoring role.
Speaker 1And did they look at other roles?
Speaker 2They mentioned mixed roles or using it as an intelligent partner, but unfortunately there wasn't enough data in the studies they analyzed to draw firm conclusions about those yet.
Speaker 1Okay, and quickly: did learning model, duration or application area significantly change the impact on higher-order thinking?
Speaker 2No, according to this analysis, those factors didn't show a significant moderating effect specifically for higher order thinking development. It seems course type and the tutoring role were the key differentiators there.
Speaker 1Right. So let's try to summarize this complex picture. Generally positive impact, right?
Speaker 2Yes.
Speaker 1Especially for performance?
Speaker 2Yes.
Speaker 1Boosted most in skills courses, through problem-based learning, and ideally used for about four to eight weeks for peak performance effect.
Speaker 2Correct.
Speaker 1Perception improves the longer you use it.
Speaker 2Yep.
Speaker 1And higher-order thinking gets the biggest lift in STEM fields, especially when ChatGPT acts like a personalized tutor.
Speaker 2You've got it. That captures the main moderating effects they found.
Comparison with Traditional AI Tools
Speaker 1Now, how does all this stack up against other AI tools that have been used in education? Is ChatGPT doing better or worse?
Speaker 2That's a great question for context. The authors briefly compared their findings, particularly on learning performance, to another recent meta-analysis that looked at more traditional AI-based assessment tools.
Speaker 1Like tools that just grade essays or quizzes automatically?
Speaker 2Kind of, yeah, focused more on evaluation. And the finding was that ChatGPT's positive impact on learning performance, that large 0.867 effect size we talked about, appears notably larger than the average impact found for those traditional AI assessment tools. That earlier meta-analysis reported an average effect size of only 0.390.
Speaker 1Wow, so more than double the impact on performance compared to those older assessment AI.
Speaker 2It seems so based on these two meta-analyses.
Speaker 1Why the big difference?
Speaker 2The likely reason is just the sheer breadth of what ChatGPT can do. Those older tools were often quite narrow, focused on grading or feedback on specific assignments. ChatGPT is generative AI. It can explain concepts, brainstorm, simulate conversations, answer follow-up questions, draft text. It supports learning in many more ways.
Speaker 1So it's a much more versatile assistant, not just a grader.
Speaker 2Exactly. It can be involved more deeply and broadly in the learning process itself. But, and this is important, we need to circle back to some nuances and cautions.
Speaker 1So it's not all perfect.
Key Takeaways and Future Questions
Speaker 2Right. Remember, while the performance impact was large, the impacts on perception and, crucially, higher-order thinking were only moderate.
Speaker 1Yeah, so it helps you do better more easily than it helps you think better or feel better about learning on average. Why that gap?
Speaker 2Well, think about it. ChatGPT doesn't have emotional intelligence, right? It can't replicate the empathy or motivational connection a human teacher can provide, which likely limits its impact on genuine engagement or passion for learning, the perception side.
Speaker 1Okay, that makes sense for perception. What about higher order thinking? Why only moderate?
Speaker 2Because it's trained on existing data. It's incredibly good at synthesizing information, explaining things clearly, following patterns. But fostering truly critical or creative thinking, challenging assumptions, generating genuinely novel insights, that's harder. It depends a lot on how it's used.
Speaker 1So you can use it as a shortcut, just get the answer and not actually develop those deeper thinking skills.
Speaker 2Precisely, which is why the researchers stress the importance of thoughtful integration. You can't just throw ChatGPT at students and expect critical thinking to blossom.
Speaker 1What does thoughtful integration look like then, especially for higher-order thinking?
Speaker 2It means designing activities that explicitly require deeper thinking using the tool. For example, providing students with learning scaffolds, frameworks like Bloom's taxonomy maybe, to guide their interactions, and prompting them to use ChatGPT not just for answers, but to compare perspectives, evaluate sources, critique arguments or design solutions.
Speaker 1So the human educator's role in structuring the interaction becomes even more critical if the goal is deep thinking.
Speaker 2Absolutely essential. It's about guiding the use of the tool towards those higher-level cognitive goals.
Speaker 1Okay, this has been incredibly insightful. Let's try and wrap this up for our listener. What are the key takeaways if someone wants to know if ChatGPT works in education?
Speaker 2I'd say the main message is yes, generally it has a clear positive impact. Students tend to perform better, feel a bit better about learning over time and even get some support for higher-order thinking. But its effectiveness really isn't uniform. It works better in certain situations: skills courses, problem-based learning, used for that optimal four-to-eight-week duration for performance, and especially when playing an intelligent tutor role for boosting critical thinking in fields like STEM.
Speaker 1So for you, listening, if you're looking to learn things effectively, this suggests ChatGPT can definitely be a powerful tool in your toolkit, but maybe think strategically about how and when you use it, depending on what you're trying to learn.
Speaker 2Exactly. Use it thoughtfully. And that leads us to maybe a final thought for you to mull over.
Speaker 1Ooh, I like a provocative final thought. Go on.
Speaker 2Well, given everything we've just discussed, the clear benefits, but also the nuances and the limitations, especially around things like critical thinking and genuine engagement, how do our educational systems, our teaching approaches, need to evolve?
Speaker 1Hmm, how do we best harness these AI strengths?
Speaker 2Right. How do we leverage what AI like ChatGPT does well while actively compensating for its weaknesses? And, maybe most importantly in this new landscape, what is the truly irreplaceable role of human connection, mentorship and that dynamic Socratic interaction in education?
Speaker 1That's a huge question. What does uniquely human teaching look like alongside powerful AI? Definitely something to think about long after this deep dive ends.