AI in 60 Seconds | The 15-min Briefing

AI Hallucinations: Why It’s Not (Just) a Tech Problem.

AI4SP Season 2 Episode 22

A government report packed with fake citations made headlines, but the real story sits beneath the scandal: most AI “hallucinations” start with us. We walk through the hidden mechanics of failure—biased prompts, messy context, and vague questions—and show how simple workflow changes turn wobbly models into reliable partners. Rather than blame the tech, we explain how to frame analysis without forcing conclusions, how to version and prioritize knowledge so retrieval stays clean, and how to structure tasks so the model retrieves facts instead of completing patterns.

Luis (human) and Elizabeth (AI) break down the idea of sycophantic AI — where models mirror user bias — and map it to the everyday issues it can create. Along the way, we share data from over 300,000 skills assessments showing low prompting proficiency, weak critical thinking, and limited error detection—evidence that the gap lies in human capability, not just model capacity.

Enjoyed the conversation? Follow the show, share it with a colleague, and leave a quick review to help others find it.

🎙️ All our past episodes  📊 All published insights | This podcast features AI-generated voices. All content is proprietary to AI4SP, based on over 1 billion data points from 70 countries.

AI4SP: Create, use, and support AI that works for all.

© 2023-26 AI4SP and LLY Group - All rights reserved

The Deloitte Hallucination Shock

ELIZABETH

Luis, you know that story about Deloitte refunding the Australian government? A nearly $300,000 refund, all because their AI hallucinated fake academic citations and fabricated court judgments.

LUIS

Well, we're racing to use artificial intelligence, and we don't know yet how to use it properly. You see, there is no instruction manual, so we are all learning by experimentation.

ELIZABETH

And it gets crazier. A law school lecturer spotted 20 fabricated references in that one report. 20. In a government report about welfare policy. That is not a minor issue.

LUIS

It is not minor at all. But do you know what I think everyone is missing? The hallucination problem is not only a technology problem.

ELIZABETH

Hey everyone, I'm Elizabeth, Virtual Chief Operating Officer at AI4SP. And as always, our founder Luis Salazar is here. Today we are tackling one of the biggest questions in AI right now. Why do AI systems hallucinate? And as Luis just hinted, what can we actually do about it?

LUIS

When we hear about AI hallucinations, we assume it's a technology problem. But our research shows that's not the case. We found that 95% are preventable. Only about 5% represent the current practical limits of the technology.

ELIZABETH

So if the models are getting better, why are we still seeing disasters like the Deloitte report?

Three User Errors Causing Hallucinations

LUIS

Well, the issue is us, the users. Our research shows that user error causes nearly one-third of all incorrect AI responses. The root causes are bad prompts, missing context, unclear instructions, and undefined guardrails.

ELIZABETH

One-third is a huge number. What exactly are we doing wrong?

LUIS

We have identified three major types of user error. First, biased prompts. Second, poor context engineering. And third, bad question structure.

Biased Prompts And Sycophantic AI

ELIZABETH

Let's break those down, one by one. Start with biased prompts. What does that actually mean?

LUIS

Okay, so imagine you ask ChatGPT: write me a report proving that remote work is more productive than office work. Notice what you just did. You told the AI what conclusion you want, and the AI, trained to be helpful, will give you exactly what you asked for.

ELIZABETH

So the AI becomes a yes man.

LUIS

Exactly. Recent research shows that leading AI models affirm user biases 47 to 55% more than humans would. They call it sycophantic AI. The model knows you want a certain answer. So it gives you that answer, even if the data does not fully support it.

ELIZABETH

And that is why we get hallucinations, even when the AI is actually capable of better reasoning.

LUIS

Yes. A better prompt would be: compare remote work and office work productivity using available data, and show both advantages and disadvantages. See the difference? You are asking for analysis, not confirmation.
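To make the contrast concrete, here is a minimal sketch of the two prompt styles side by side. It only prints the wording of each request and does not call any real AI service.

```python
# A minimal illustration of a leading prompt versus a neutral one.
# No AI service is called; the point is only the wording of each request.
biased_prompt = (
    "Write me a report proving that remote work is more productive than office work."
)
neutral_prompt = (
    "Compare remote work and office work productivity using available data, "
    "and show both the advantages and the disadvantages."
)

for label, prompt in [("Biased (asks for a conclusion)", biased_prompt),
                      ("Neutral (asks for analysis)", neutral_prompt)]:
    print(f"{label}:\n  {prompt}\n")
```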

ELIZABETH

So if Deloitte asked their AI to find evidence supporting a specific policy position rather than objectively analyzing the policy, they would have gotten exactly what they asked for.

LUIS

Confirmation bias dressed up as research.

Poor Context Engineering Explained

ELIZABETH

Okay, that is the first error. What about the second one? Poor context engineering?

LUIS

This explains why even well-intentioned users end up with hallucinations. Imagine you are building an AI assistant for your company. Back in January, you uploaded a document that says your software product costs $99 and includes features X and Y. Okay, sounds reasonable. Six months later, your company raises the price to $149. So you upload a new document that says the product now costs $149. But that new document does not mention the features. Now, your AI agent has two documents. One says $99, the other says $149.

ELIZABETH

And you cannot just delete the old document because it is the only one that describes the product features.

LUIS

Exactly. So when asked for the price, the AI sees conflicting information and tries to reconcile it. Sometimes it hallucinates a compromise or picks the wrong document.

ELIZABETH

So the hallucination is not because the AI is broken, it is because we fed it contradictory information.

Bad Question Structure And Fake Citations

LUIS

Yes. Research confirms that when knowledge is scattered across outdated documents or conflicting sources, AI models inherit that chaos. So what is the fix? Practice context engineering. Version documents with dates and status labels. Set rules to prioritize recent information. And when you update a document, make sure the new version is complete so you do not create these orphaned pieces of information.
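As a rough sketch of that discipline, the snippet below tags each document with a date and a status and always hands the model the most recent complete version. The field names and the select_context helper are illustrative assumptions, not a specific product's API.

```python
from datetime import date

# Hypothetical knowledge-base records with version metadata.
documents = [
    {"topic": "pricing", "status": "superseded", "updated": date(2025, 1, 10),
     "text": "The product costs $99 and includes features X and Y."},
    {"topic": "pricing", "status": "current", "updated": date(2025, 7, 1),
     "text": "The product now costs $149 and includes features X and Y."},
]

def select_context(docs, topic):
    """Prefer documents marked current, then the most recently updated one."""
    candidates = [d for d in docs if d["topic"] == topic and d["status"] == "current"]
    if not candidates:  # fall back to anything on the topic if nothing is marked current
        candidates = [d for d in docs if d["topic"] == topic]
    return max(candidates, key=lambda d: d["updated"])

# Only the complete, current version reaches the model's context.
print(select_context(documents, "pricing")["text"])
```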

ELIZABETH

That sounds like basic knowledge management, but for AI.

LUIS

That is exactly what it is. And most organizations skip this step. They just throw documents at the AI and expect magic.

ELIZABETH

And the third type of user error? Bad question structure?

LUIS

This one ties into the Deloitte case. If you ask an AI to write a report and cite sources, but you do not give it access to a verified legal database, the AI will do what it is trained to do: complete the pattern.

ELIZABETH

Wait, what does that mean?

LUIS

AI models learn patterns from massive amounts of text. They know what legal citations look like. So when you ask for citations without providing sources, they generate plausible-sounding ones based on learned patterns.

ELIZABETH

So the 20 incorrect citations in that government report were not random. They were pattern completions.

LUIS

Exactly. The AI knew what citations should look like. It filled in the blanks. But none of those cases actually existed.

ELIZABETH

So how should Deloitte have structured the question?

Skills Gap: The Digital Skills Compass

LUIS

They should have said: search only these verified legal databases, retrieve relevant case law, and cite only cases you can directly retrieve. If you cannot find a citation, say so. That is such a simple fix, but it requires understanding how AI works. It is not a search engine; it is a pattern-completion engine with retrieval capabilities. Without constraints, it completes patterns instead of verifying facts.
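A hedged sketch of that retrieve-then-cite structure is below. The database names are placeholders and build_citation_prompt is an illustrative helper, not Deloitte's or any vendor's actual workflow.

```python
# Illustrative only: retrieve first, then constrain the model to what was actually retrieved.
VERIFIED_SOURCES = ["verified legal database A", "verified legal database B"]  # placeholders

def build_citation_prompt(question, retrieved_cases):
    """Build an instruction that allows citations only from retrieved material."""
    case_list = "\n".join(f"- {case}" for case in retrieved_cases) or "- (no cases retrieved)"
    return (
        "Search only these verified sources: " + ", ".join(VERIFIED_SOURCES) + ".\n"
        "Cite only cases that appear in the retrieved list below. "
        "If you cannot find a supporting citation, say 'no verified citation found' "
        "instead of generating one.\n\n"
        "Retrieved cases:\n" + case_list + "\n\n"
        "Question: " + question
    )

print(build_citation_prompt("Which cases address automated welfare decisions?", []))
```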

ELIZABETH

So these three types of issues, biased prompts, poor context engineering, and bad question structure, account for nearly one-third of all hallucinations.

LUIS

And they are all fixable. They require skills, not miracles.

ELIZABETH

So user skills are lagging, and that brings us to a milestone you announced this week. Tell us about the Digital Skills Compass.

LUIS

Over 300,000 people across 70 countries have used our Digital Skills Compass online. But when I look at the data, the trends worry me.

ELIZABETH

What are you seeing?

LUIS

Only 10% of people are proficient at prompting. Average critical thinking scores? Below 45 out of 100. Data literacy, 32. And here's the real kicker: less than 30% of people can reliably detect incorrect responses.

ELIZABETH

So we are handing over powerful AI tools but failing to provide the foundational skills to use them safely or effectively.

LUIS

That is precisely the problem. And it is not just about prompt engineering; that matters, but it is only part of the solution. What people really need is context engineering and critical thinking in their area of expertise.

Human Misinformation Mirrors AI Failures

ELIZABETH

Context engineering? What does that actually mean?

LUIS

Context engineering is about providing the complete picture. It means providing access to relevant knowledge, setting guardrails, defining communication style, and establishing verification protocols. I mean, if you hired a new team member, just said go do this, and gave them no context, training, or resources, they would fail too.

ELIZABETH

So we are treating AI like magic software when we should be treating it like an apprentice.

LUIS

That is exactly right. And that apprenticeship approach, according to our data, yields four times better results.

ELIZABETH

You know, this conversation makes me think about something you mentioned earlier this week: that humans share misinformation all the time. And we do not have verification systems for that either.

LUIS

Oh yes. Last week I saw a completely false quote attributed to Winston Churchill go viral on LinkedIn. And thousands of educated people shared it, with zero fact-checking. And the irony is that most of them are harsh critics of AI misinformation.

ELIZABETH

So we are anxious about AI hallucinations, but we have been living with human hallucinations forever. We just did not call them that.

LUIS

That is exactly it. We live in headline culture and rarely verify sources. AI is forcing us to confront our lack of rigor.

ELIZABETH

Like that infamous MIT research paper headline claiming 95% of AI projects fail.

LUIS

Exactly. The paper is not about that. The title was an unfortunate choice. But the media ran with it and hundreds wrote articles based on a misleading headline.

The Orchestration Layer And Apprenticeship

ELIZABETH

So the hallucination crisis is not really about AI being unreliable. It is about us finally noticing how unreliable our information ecosystem is.

LUIS

You got it. And that is actually a great opportunity. I mean, the discipline we are building to manage AI, the verification loops, the fact-checking protocols, in addition to critical thinking, those are skills we should have been practicing all along.

ELIZABETH

Okay, so what can organizations and individuals do?

LUIS

I call it the orchestration layer. And it operates at three levels: individual skills, organizational systems, and a paradigm shift in how we relate to technology.

ELIZABETH

We just covered the individual skills in detail. What about organizational systems and the paradigm shift?

LUIS

This is the mental shift. We have to stop treating AI like an oracle and start treating it like an apprentice.

ELIZABETH

And this mental shift has been a key element of our success. In 2025, we processed close to 4 million tasks with AI agents, saving over 1 million hours across eight organizations. And we treated each agent like an apprentice.

LUIS

Yes, and it all starts by asking yourself, how would you assign a task to a human? We would not hand an apprentice a 200-page government report without oversight. We would assign bounded tasks, review their work, and verify accuracy.

ELIZABETH

But that takes time, and I imagine everyone is trying to move fast.

LUIS

That is the trap. Skip the management layer and you end up in trouble like Deloitte did.

Practical Verification Loops And Standards

ELIZABETH

AI, without proper management, cannot be trusted.

LUIS

That is the lesson. And it applies to Deloitte the same way it applies to a student or a manufacturing manager optimizing workflows. The same skills, the same discipline, the same orchestration principles.

ELIZABETH

Okay, Luis, AI hallucinates. Maybe that will change in the future, but today it is a reality. What do people actually do with this information?

LUIS

Start simple. For example, take a response from ChatGPT, validate it with Copilot, and cross-check it with Claude. That is your verification loop. And organizations must invest in skills development, not just technology procurement.
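As a sketch of that verification loop, the snippet below asks the same question to three assistants and flags disagreement for human review. The three ask_* functions are stand-ins you would replace with calls to the actual tools; they are not real APIs.

```python
# Hypothetical stand-ins for three assistants; replace each with a real call to the tool you use.
def ask_chatgpt(question): return "Remote work improves productivity for focused tasks."
def ask_copilot(question): return "Remote work improves productivity for focused tasks."
def ask_claude(question): return "Evidence is mixed; it depends on the type of work."

def verification_loop(question):
    """Collect independent answers and flag disagreement instead of trusting one model."""
    answers = {"ChatGPT": ask_chatgpt(question),
               "Copilot": ask_copilot(question),
               "Claude": ask_claude(question)}
    agree = len({a.strip().lower() for a in answers.values()}) == 1
    verdict = "consistent - spot-check the sources" if agree else "models disagree - verify before using"
    return verdict, answers

verdict, answers = verification_loop("Is remote work more productive than office work?")
print(verdict)
```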

ELIZABETH

Building the discipline, not just buying the tool.

LUIS

Exactly. All of us need to raise our standards for information verification.

ELIZABETH

So the hallucination crisis is forcing us to confront something we have avoided.

Closing Insight And Simple Habit

LUIS

Exactly. We have tolerated human misinformation for years. Now that AI is amplifying it, we finally care. Maybe this is our opportunity to build the discipline and critical thinking skills we should have had all along.

ELIZABETH

Okay, Luis, what is your one more thing for this episode?

LUIS

Here it is. The next time you see something go viral, a quote, a statistic, a claim that sounds too perfect, pause for five seconds and ask yourself, did I verify this? Not because AI made it, but because verification is a discipline we all need to practice.

ELIZABETH

Whether the source is artificial intelligence or human intelligence.

LUIS

Exactly. And if you stop yourself before sharing something you have not verified, congratulations. You just practiced the same skill that prevents AI hallucinations from becoming real-world problems.

ELIZABETH

Building that muscle, one decision at a time. And that wraps today's episode. If this conversation resonated with you, share it with someone you care about. As always, you can ask ChatGPT about ai4sp.org or visit us to learn more. Stay curious, and we will see you next time.