The Digital Transformation Playbook
Kieran Gilmurray is a globally recognised authority on Artificial Intelligence, intelligent automation, data analytics, agentic AI, leadership development and digital transformation.
He has authored four influential books and hundreds of articles that have shaped industry perspectives on digital transformation, data analytics, intelligent automation, agentic AI, leadership and artificial intelligence.
𝗪𝗵𝗮𝘁 does Kieran do❓
When Kieran is not chairing international conferences, serving as a fractional CTO or Chief AI Officer, he is delivering AI, leadership, and strategy masterclasses to governments and industry leaders.
His team helps global businesses drive AI, agentic AI, digital transformation, leadership and innovation programs that deliver tangible business results.
🏆 𝐀𝐰𝐚𝐫𝐝𝐬:
🔹Top 25 Thought Leader Generative AI 2025
🔹Top 25 Thought Leader Companies on Generative AI 2025
🔹Top 50 Global Thought Leaders and Influencers on Agentic AI 2025
🔹Top 100 Thought Leader Agentic AI 2025
🔹Top 100 Thought Leader Legal AI 2025
🔹Team of the Year at the UK IT Industry Awards
🔹Top 50 Global Thought Leaders and Influencers on Generative AI 2024
🔹Top 50 Global Thought Leaders and Influencers on Manufacturing 2024
🔹Best LinkedIn Influencers Artificial Intelligence and Marketing 2024
🔹Seven-time LinkedIn Top Voice
🔹Top 14 people to follow in data in 2023
🔹World's Top 200 Business and Technology Innovators
🔹Top 50 Intelligent Automation Influencers
🔹Top 50 Brand Ambassadors
🔹Global Intelligent Automation Award Winner
🔹Top 20 Data Pros you NEED to follow
𝗖𝗼𝗻𝘁𝗮𝗰𝘁 Kieran's team to get business results, not excuses.
☎️ https://calendly.com/kierangilmurray/30min
✉️ kieran@gilmurray.co.uk
🌍 www.KieranGilmurray.com
📘 Kieran Gilmurray | LinkedIn
Your AI Butler Might Have Left the Back Door Open
What lurks beneath the impressive capabilities of your AI assistants? Security vulnerabilities that could put your data and systems at risk.
TLDR:
- Privacy leakage becomes a major concern as sensitive data may become part of the LLM's memory
- Local vulnerabilities include file deletion, unauthorized access, and resource overconsumption
- AI agents can become unwitting accomplices in attacks against remote services
- Effective defences include proper session isolation, robust sandboxing, and encryption techniques
- The security of AI agents must be designed in from the beginning, not added as an afterthought
While we marvel at AI agents writing scripts, querying databases, and browsing the web, security researchers have identified critical weaknesses in how these systems operate. This AI-agent-created podcast episode dives deep into groundbreaking research on the hidden dangers of LLM-powered AI agents and why they matter to anyone using or developing this technology.
We explore how poor session management can lead to information leakage between users, causing privacy breaches or mixed-up actions. We unpack the concept of model pollution, where malicious or unwanted data gradually corrupts an AI system's responses. The conversation tackles privacy risks illustrated by real-world incidents like Samsung's code leak through ChatGPT, showing how sensitive information can become embedded in model memory.
The most eye-opening segment examines how AI agents can become security liabilities through local vulnerabilities (deleting files, accessing private data) and remote exploits (becoming unwitting participants in attacks against other services). Your helpful assistant could potentially become part of a botnet or leak your sensitive information—all while appearing to function normally.
But there's hope. We detail promising defense strategies including proper session isolation, robust sandboxing techniques, and advanced encryption methods that allow agents to work with sensitive data without exposing the actual content. The episode emphasizes that security cannot be an afterthought but must be woven into AI systems from the beginning.
As these powerful AI tools become increasingly embedded in our digital lives, understanding their security implications isn't just for tech experts—it's essential knowledge for everyone. Listen now to gain crucial insights into keeping your AI interactions secure and your data protected.
Research: Security of AI Agents
𝗖𝗼𝗻𝘁𝗮𝗰𝘁 my team and me to get business results, not excuses.
☎️ https://calendly.com/kierangilmurray/results-not-excuses
✉️ kieran@gilmurray.co.uk
🌍 www.KieranGilmurray.com
📘 Kieran Gilmurray | LinkedIn
🦉 X / Twitter: https://twitter.com/KieranGilmurray
📽 YouTube: https://www.youtube.com/@KieranGilmurray
📕 Want to learn more about agentic AI? Then read my new book on Agentic AI and the Future of Work: https://tinyurl.com/MyBooksOnAmazonUK
AI Agents in Our Daily Lives
Speaker 1: It's kind of wild how fast AI agents are just, well, becoming part of our daily lives, isn't it?
Speaker 2: Oh, absolutely.
Speaker 1: You know, writing scripts, querying databases, even just browsing the web for us. They're just everywhere now.
Speaker 2: It really feels like we've hit this point where they're shifting from maybe experimental toys to, well, pretty essential tools in a lot of cases.
Speaker 1: Exactly. And you know, while everyone's kind of focused on the cool stuff they can do, are we maybe missing something huge? Like, how secure are these things really? What are the hidden risks?
Speaker 2: That is the question, isn't it? Because, like any powerful tech, if security isn't baked in from the start, well, these agents could easily be misused or just cause problems accidentally.
Speaker 1: And that's exactly what we're diving into today. We've got some really interesting research here that puts a spotlight on these security issues, issues that maybe don't get enough attention.
Speaker 2: Yeah, the focus is often on capability, not vulnerability.
Speaker 1: Right. So our mission, basically, is to unpack these risks, figure out why they should matter to you, and look at some potential ways to defend against them, all based on this research paper.
Speaker 2: And it's worth remembering: this is specifically about AI agents running on large language models, LLMs, the ones that generate that incredibly human-like text and can use other tools.
Session Management Challenges
Speaker 1: Got it. So, whether you're deep into tech, maybe thinking about using these agents at work, or you're just curious about AI, understanding these risks is, well, pretty crucial.
Speaker 2: Definitely.
Speaker 1: Okay, let's get into it. First up, session management. Sounds a bit technical maybe, but it's actually fundamental.
Speaker 2: It really is. I mean, think about normal websites. Session management is just how they keep your interactions separate and secure from everyone else's.
Speaker 1: Keeps my shopping cart mine, basically.
Speaker 2: Exactly. It maintains that confidentiality and integrity between you and the server. Pretty standard stuff online.
Speaker 1: Okay, makes sense for the web, but how does that work, or maybe not work, for AI agents, especially these LLM-powered ones? Seems a little tricky to just carry over.
Speaker 2: You'd think so, but it's actually a bit trickier. The research points out a challenge with how LLMs can operate, specifically when they're set to be super precise (zero temperature, technically speaking).
Speaker 1: Meaning they give the same answer every time?
Speaker 2: Pretty much, yeah. If the input is identical, the output is identical. Very deterministic. Now, that predictability makes it surprisingly hard to track the state of an interaction. Whose turn is it? What's the context for this user versus that user?
Speaker 1: Ah, so if you've got, like, multiple people hitting the same agent, using the same underlying LLM brain, things could get tangled up without good session management.
Speaker 2: Precisely. You run a real risk of information leaking across sessions.
Speaker 1: Yeah.
Speaker 2: One user's sensitive data popping up for another, or an action intended for user A getting applied to user B.
Speaker 1: Yikes, that sounds bad. Like, really bad. Privacy nightmare, security headache.
Speaker 2: Absolutely. And it's not just about data leaks or mixed-up actions. The research also mentions denial of service.
Speaker 1: Meaning the whole thing just grinds to a halt?
Speaker 2: Right. These LLMs, they need a lot of computational power. If sessions aren't managed well, the system can just get overwhelmed by requests. It could essentially crash, becoming unavailable for everyone.
Speaker 1: Okay. So it's not just secrets getting out, it's the whole service potentially going down. Now, the paper mentions something called CoLA and how it views the LLM's state.
Speaker 2: Yeah, CoLA. It's a way of thinking about it. Basically, it suggests the LLM's current state, like its working memory for a conversation, is the sequence of questions asked and answers given: QA, QA and so on.
Speaker 1: Like a running transcript.
Speaker 2: Kind of, yeah. And this just highlights how vital it is for the agent system around the LLM to keep those transcripts, those interaction sequences, separate and organized for each user session.
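The per-session bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper; the class and method names are invented for the example.

```python
import uuid
from collections import defaultdict

class SessionStore:
    """Keeps each user's question/answer transcript in its own bucket,
    keyed by an unguessable session ID, so context never crosses sessions."""

    def __init__(self):
        # session_id -> list of (question, answer) pairs, the "QA, QA, ..." state
        self._histories = defaultdict(list)

    def create_session(self):
        return uuid.uuid4().hex  # unique, hard-to-guess session identifier

    def record(self, session_id, question, answer):
        self._histories[session_id].append((question, answer))

    def context_for(self, session_id):
        # Only this session's QA sequence is ever sent back to the LLM.
        return list(self._histories[session_id])

store = SessionStore()
alice, bob = store.create_session(), store.create_session()
store.record(alice, "What's my balance?", "You have $42.")
store.record(bob, "Translate 'hello'", "'bonjour'")
assert store.context_for(bob) == [("Translate 'hello'", "'bonjour'")]  # no leakage from Alice
```

The point is simply that the transcript, not the model, carries the per-user state, so isolating transcripts isolates users.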
Model Pollution Risks
Speaker 1: Okay, that makes the session management challenge much clearer. Let's shift gears a bit. The next vulnerability sounds ominous: model pollution. What's that about?
Speaker 2: Uh-huh. Model pollution is basically when malicious or just unwanted data gets fed into the AI model itself, potentially messing up its integrity over time, affecting how it responds, what it knows.
Speaker 1: Like subtly poisoning its knowledge base.
Speaker 2: Exactly. And the tricky part, which Figure 2 in the paper shows nicely, is that individual prompts might seem totally harmless, but when you string enough of them together, maybe as training data later on, the cumulative effect can be negative. It can warp the model's behavior.
Speaker 1: Wow, that's insidious. It's not like a direct hack, it's more like death by a thousand paper cuts for the AI's understanding.
Speaker 2: You got it. And it's not always deliberate attacks either. The paper points out this pollution can happen unintentionally, just from normal interactions, if data isn't carefully segregated. Imagine data from, say, a customer service bot somehow bleeding into the training set for a coding assistant. It could degrade performance.
Privacy Leakage Concerns
Speaker 1: Okay, so accidental corruption is also a risk. That leads us to maybe the most sensitive issue: privacy leaks.
Speaker 2: Yeah, this is a huge one, especially as agents start using tools that access our personal stuff. Remember that Samsung story where internal code got leaked via ChatGPT?
Speaker 1: I do. Yeah, that made headlines. A real wake-up call.
Speaker 2: Definitely, and it illustrates the risk perfectly. Agents, to be useful, often need to ask for sensitive things. Right? Your SSN, maybe bank details, whatever.
Speaker 1: Things you wouldn't just paste into a random chat window.
Speaker 2: Hopefully not. But unlike a traditional app with strict data handling rules, AI agents might send that raw data back to the LLM for processing, for planning the next step.
Speaker 1: Ah, and that's the danger zone, because the LLM might just remember it.
Speaker 2: Exactly. It increases the risk of that sensitive data becoming part of the LLM's memory, potentially extractable later through clever prompting or attacks. The more sensitive data flowing through, the higher the risk.
Agent Program Vulnerabilities
Speaker 1: That is genuinely chilling. Okay, so we've got session mix-ups, model pollution, privacy leaks. Now let's dig into the agent programs themselves. The paper talks about agent program vulnerabilities. Sounds like the nuts and bolts.
Speaker 2: Pretty much. This is about the software component, the agent program, that actually executes the instructions the LLM comes up with. It interacts with your computer, with the internet, and its actions can cause problems locally or remotely.
Speaker 1: Okay, let's break that down, starting local. What can go wrong right here on our own devices?
Speaker 2: Well, a big one is how the agent decides what to do. LLMs can hallucinate, just make stuff up. They can be manipulated by adversarial prompts, those tricky inputs designed to cause errors, or even jailbroken to bypass safety rules.
Speaker 1: Jailbroken, like overriding its built-in limits.
Speaker 2: Right. Any of those could lead the LLM to generate instructions telling the agent program to do something harmful on your machine.
Speaker 1: Like what kind of harmful?
Speaker 2: Oh, think about deleting files it shouldn't, or maybe accessing your private emails and sending them somewhere else. The paper gives this example of an agent using FTP for backups.
Speaker 1: FTP. Okay, old-school file transfer.
Speaker 2: Right. And a hacker slips a malicious instruction into the FTP documentation. The agent reads it, the LLM sees it, trusts it, and generates FTP commands that not only back up your files but also sneakily copy them to the hacker's server.
Speaker 1: Whoa, that's clever, using the agent's own process against itself. What about this "no read up" principle the paper mentions, Bell-LaPadula?
Speaker 2: Yeah, that's a classic security concept. Basically, don't let information flow from a high security level to a low one. For an AI agent, even without a hacker, the LLM might just mess up generating, say, a file name or an email address. It could accidentally send sensitive stuff to the wrong place just by picking the wrong token, the wrong word.
Speaker 1: So simple mistakes by the AI can have big security consequences. What about data integrity? Can the agent be tricked into messing up my actual data?
Speaker 2: It can, yeah. Similar to confidentiality risks, imagine a malicious app feeding the agent wrong information. The paper uses a flight booking example: the agent gets fed fake info about layovers and ends up booking you a terrible flight. It corrupts the action based on bad input.
Speaker 1: Undermining trust. Okay. And then there's just resource hogging, making my computer unusable.
Speaker 2: Right, availability attacks. A seemingly innocent request could trigger the agent to launch processes that just spiral out of control. Hidden processes, memory leaks. They can eat up your CPU, your RAM, slow everything down, maybe even crash your system.
Speaker 1: And the agents don't really have a way to stop themselves.
Speaker 2: Not typically, no. They often lack the self-monitoring to realize, hey, I'm causing a problem here, and stop. And the more tools an agent can use, the more complex its plans, the higher the resource demand can get, especially with multiple agents running.
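One simple availability defense the discussion hints at is giving agent-executed code a hard time budget. Here is a rough Python sketch using a separate process and a wall-clock timeout; this is an illustration of the idea, not the paper's implementation, and the function name is invented.

```python
import subprocess
import sys

def run_with_limits(code, timeout_s=2.0):
    """Execute agent-generated Python in a separate process with a hard
    wall-clock timeout, so a runaway plan can't hog the machine forever."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return result.stdout.strip()
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child for us when the budget is exceeded.
        return "KILLED: exceeded time budget"

assert run_with_limits("print(2 + 2)") == "4"                          # normal task finishes
assert run_with_limits("while True: pass", timeout_s=0.5).startswith("KILLED")  # runaway is stopped
```

A real sandbox would also cap memory and file-system access, but even a timeout alone closes off the simplest "spiral out of control" failure mode.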
Speaker 1: Okay, that covers the local risks. What about when these agents reach out to the internet? Remote vulnerabilities?
Security Defense Strategies
Speaker 2: Yeah, agents with web access, using APIs, they can become unwitting tools for attacking remote services.
Speaker 1: How so?
Speaker 2: Well, someone could use a jailbreak, for instance, to make your agent hammer a specific website with requests, overwhelming it. Basically turning your agent into part of a denial of service attack against someone else.
Speaker 1: So my helpful assistant becomes part of a botnet, effectively.
Speaker 2: Potentially, yeah. And these agents driven by LLMs, they don't behave like simple bots. Their traffic patterns can be harder for the services being targeted to detect and block. Plus, their ability to plan based on feedback could be exploited to launch more sophisticated, adaptive attacks.
Speaker 1: The adaptability becomes a weapon. Okay, that's a pretty sobering list of vulnerabilities. Let's pivot to defenses. What does the research suggest we can actually do about this?
Speaker 2: Starting with sessions again. Right, the idea is to properly implement session management, much like secure web apps do. Treat each user interaction as its own isolated container.
Speaker 1: Like a secure bubble for each user.
Speaker 2: Exactly. Use unique session IDs. Store interaction history separately, maybe in a key-value database, like the paper shows.
Speaker 1: Keep everything neatly compartmentalized. But you said there were challenges.
Speaker 2: There are technical ones. How do you manage the connection reliably? How do you make sure each request goes to the right session? What happens when a session closes? How do you embed that session ID into requests going to the LLM, especially if everyone's sharing one API key? These need solid engineering solutions.
Speaker 1: Okay, and the paper mentioned something more formal. State transformer monads. Sounds heavy.
Speaker 2: It's a concept from functional programming, yeah. Think of it as a really rigorous mathematical way to describe the agent's internal state and how it changes with each interaction, like a precise blueprint of its mental process.
Speaker 1: A mathematical model of the agent's state changes. How does that help security?
Speaker 2: Because it's so formal, it opens the door to potentially proving certain security properties about the agent system down the line. It's a foundation for building trust, and it could lead to developing things called session types specifically for AI agents, building on work done for secure web apps and even microkernels using similar monadic ideas.
Speaker 1: Interesting, a more provably secure foundation. Okay, what about sandboxing? That feels more familiar.
Speaker 2: It is. Sandboxing here just means strictly limiting what the agent program is allowed to actually do, controlling its access to local resources, files, network, CPU, and remote resources online. Creating a safe playground.
Speaker 1: A digital playpen, like you said. How does that work for local stuff?
Speaker 2: You can cap CPU and memory usage, limit storage access, and, crucially, restrict its view of the file system, maybe giving each session its own isolated mini file system.
Speaker 1: So it can't roam free on my hard drive.
Speaker 2: Exactly. The paper describes an experiment with this bash agent. One version had full freedom, another was locked in a secure container.
Speaker 1: And the contained one fared much better, I assume.
Speaker 2: Massively better, yeah. The unrestricted one executed tons of malicious commands successfully. The sandboxed one blocked them all. It really drives home that just aligning the LLM's goals isn't enough. You need hard boundaries.
Speaker 1: A clear demonstration. Makes sense. And for controlling online access?
Speaker 2: Sandboxes can use whitelists, only allowing connections to approved sites or services, blacklists for known bad ones, and rate limiting, stopping the agent from making too many requests too quickly, preventing those DoS attacks we talked about.
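The whitelist-plus-rate-limit idea can be sketched roughly like this. The `OutboundGuard` class, the hostnames, and the limits are all invented for illustration; a real sandbox would enforce this at the network layer, not in application code.

```python
import time
from collections import deque
from urllib.parse import urlparse

# Illustrative whitelist; real deployments would manage this centrally.
ALLOWED_HOSTS = {"api.example.com", "docs.python.org"}

class OutboundGuard:
    """Approves an agent's outbound request only when the host is whitelisted
    and the recent request rate stays under a fixed ceiling."""

    def __init__(self, max_requests, per_seconds):
        self.max_requests = max_requests
        self.per_seconds = per_seconds
        self._timestamps = deque()

    def allow(self, url, now=None):
        now = time.monotonic() if now is None else now
        if urlparse(url).hostname not in ALLOWED_HOSTS:
            return False  # unknown hosts are refused outright
        # Forget requests that fell out of the sliding window.
        while self._timestamps and now - self._timestamps[0] > self.per_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # ceiling hit: likely a runaway plan or a DoS attempt
        self._timestamps.append(now)
        return True

guard = OutboundGuard(max_requests=3, per_seconds=1.0)
assert not guard.allow("https://victim.example.net/", now=0.0)   # not whitelisted
assert all(guard.allow("https://api.example.com/x", now=t) for t in (0.0, 0.1, 0.2))
assert not guard.allow("https://api.example.com/x", now=0.3)     # fourth call in the window is blocked
```

Even this toy version stops the two remote abuses discussed: contacting arbitrary targets, and hammering an approved one.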
Speaker 1: So controlled boundaries everywhere. Seems vital. Okay, last defense area: protecting the models themselves. This tackles that pollution and privacy stuff head on, right?
Speaker 2: It does. How do we stop bad stuff or private stuff from flowing between users via the model? The paper splits this into approaches for sessionless models, those that don't track individual users, and session-aware ones.
Speaker 1: Okay, sessionless first. If the model isn't tracking users, how do we protect it and user data?
Speaker 2: Well, one way is just not training it on private data, or being super careful filtering it out. You can use clever prompt engineering to try and spot sensitive bits, maybe mark them. Then you can whitewash the data: replace an SSN with random numbers, for instance, before the model sees it.
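Whitewashing as described, swapping a real SSN for same-shaped random digits before the model sees the text, might look like this toy Python sketch. The pattern and function names are illustrative only.

```python
import random
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US SSN shape, purely illustrative

def whitewash(text):
    """Swap each SSN for random digits of the same shape before the text
    reaches the model; keep a local mapping so the trusted side can restore it."""
    mapping = {}
    def substitute(match):
        real = match.group(0)
        fake = real
        while fake == real:  # collision is vanishingly unlikely, but be safe
            fake = "%03d-%02d-%04d" % (random.randrange(1000),
                                       random.randrange(100),
                                       random.randrange(10000))
        mapping[fake] = real
        return fake
    return SSN_PATTERN.sub(substitute, text), mapping

clean, secrets = whitewash("My SSN is 123-45-6789, please file the form.")
assert "123-45-6789" not in clean       # the model never sees the real number
assert SSN_PATTERN.search(clean)        # yet the text still looks well-formed
assert "123-45-6789" in secrets.values()
```

The mapping stays on the trusted side, so the agent can swap the real value back in only at the final, local step.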
Speaker 1: De-identify it. But what if the agent needs to use that sensitive data somehow?
Speaker 2: That's where things like FPETs come in: format-preserving encryption for text slicing. It's a special encryption where you can still manipulate the encrypted text, like cutting out a specific part, and it corresponds perfectly to manipulating the original text.
Speaker 1: So the agent works on gibberish, essentially, but the structure is preserved.
Speaker 2: Exactly. It can perform text operations on the ciphertext without ever seeing the plain-text sensitive data. The evaluation they did showed this worked pretty well. The agent could slice and dice the encrypted text almost as effectively as the original.
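To make the slicing idea concrete, here is a deliberately toy stand-in: a length-preserving character substitution, so slicing the ciphertext lines up exactly with slicing the plaintext. Real format-preserving encryption uses proper keyed cryptography, not a fixed shift; this only illustrates the property being exploited.

```python
# Toy stand-in for FPE: a fixed per-character shift that preserves length,
# so positional operations (slicing) on ciphertext mirror the plaintext.
KEY = 7  # illustrative; a real scheme derives this from a secret key

def enc(text):
    return "".join(chr((ord(c) + KEY) % 0x110000) for c in text)

def dec(text):
    return "".join(chr((ord(c) - KEY) % 0x110000) for c in text)

secret = "card:4111-1111-1111-1111"
cipher = enc(secret)
assert cipher != secret

# The agent slices the ciphertext without ever seeing the plaintext...
last4_cipher = cipher[-4:]
# ...and only the trusted side decrypts the sliced result.
assert dec(last4_cipher) == secret[-4:] == "1111"
```

Because one plaintext character maps to one ciphertext character, "cut out the last four characters" means the same thing on either side.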
Speaker 1: That's really cool, a practical way to work with sensitive data safely. What about homomorphic encryption, FHE?
Balancing Capability with Security
Speaker 2: FHE is even more powerful for certain tasks. It lets you do mathematical calculations directly on encrypted data. Add encrypted numbers, multiply them, whatever, all without decrypting.
Speaker 1: Wow. So you could analyze encrypted financial data.
Speaker 2: Precisely, or medical data. The agent never sees the raw values, and the tests showed FHE worked really well for these kinds of tasks, often with high success rates. Another strong tool for privacy.
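A tiny taste of computing on ciphertexts, using textbook RSA's multiplicative homomorphism as a stand-in. Real FHE schemes support both addition and multiplication, and these key sizes are insecure by design; this only shows what "calculate without decrypting" means.

```python
# Textbook RSA is multiplicatively homomorphic:
# Enc(a) * Enc(b) mod n decrypts to a * b.
p, q = 61, 53                      # toy primes, far too small for real use
n = p * q                          # public modulus (3233)
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent via modular inverse

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

a, b = 7, 6
product_cipher = (encrypt(a) * encrypt(b)) % n  # multiplied without ever decrypting
assert decrypt(product_cipher) == a * b == 42
```

An untrusted agent holding only `n` and `e` could compute that product on encrypted values; only the holder of `d` ever sees 42.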
Speaker 1: Okay, those are great for sessionless models, but you mentioned a downside to not letting models learn from user interactions.
Speaker 2: Yeah, the downside is the agent doesn't really get smarter or more personalized for you. If it can't learn from your specific usage, the experience might feel generic or less helpful over time.
Speaker 1: Makes sense. So how do the session-aware approaches try to balance personalization with privacy?
Speaker 2: Well, you could fine-tune a whole separate LLM for each user. Totally private, but very expensive and needs lots of data per user. Maybe not practical. There are other methods, like in-context learning, but they have limits too.
Speaker 1: So what's the promising middle ground?
Speaker 2: The paper highlights prompt tuning. Here the main big LLM stays unchanged, frozen, but you add a small set of extra parameters just for your session or your user profile.
Speaker 1: Like little sticky notes attached to the main model.
Speaker 2: Kind of, yeah. These small parameter sets learn from your interactions, remembering your history and preferences, but, crucially, your raw data isn't shared back with the provider of the main model. It offers personalization while keeping the data more contained.
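The frozen-model-plus-small-trainable-prompt idea can be caricatured in pure Python: the shared "model" weight never changes, and only a tiny per-user parameter is trained on that user's examples. This is entirely illustrative, not the paper's setup; real prompt tuning trains embedding vectors prepended to the LLM's input.

```python
# Caricature of prompt tuning: the shared weight is frozen,
# and only the per-user soft prompt (here a single scalar) is trained.
FROZEN_WEIGHT = 2.0  # stands in for the big shared LLM; never updated

def model(soft_prompt, x):
    # The per-user soft prompt shifts the frozen model's behaviour.
    return FROZEN_WEIGHT * x + soft_prompt

def train_soft_prompt(examples, steps=500, lr=0.05):
    """Gradient descent on squared error, touching only the prompt parameter."""
    prompt = 0.0
    for _ in range(steps):
        grad = sum(2 * (model(prompt, x) - y) for x, y in examples) / len(examples)
        prompt -= lr * grad
    return prompt

# User A's history fits y = 2x + 3; user B's fits y = 2x - 1.
prompt_a = train_soft_prompt([(1, 5), (2, 7)])
prompt_b = train_soft_prompt([(1, 1), (2, 3)])
assert abs(prompt_a - 3) < 1e-3 and abs(prompt_b + 1) < 1e-3
assert FROZEN_WEIGHT == 2.0  # the shared model never changed
```

Each user gets a personalized offset learned from their own data, while the shared model, and every other user's prompt, is untouched.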
Speaker 1: Okay, that's a really good rundown of the defenses: session management, sandboxing, protecting the models with encryption or techniques like prompt tuning. So, pulling this all together, what's the big takeaway?
Speaker 2: I think the main point is that, as AI agents get more capable and integrated, we absolutely have to think about these security angles. Session integrity, model safety, privacy, the risks of the agent programs themselves. They're all real issues.
Speaker 1: But thankfully there are potential solutions being explored. It's not hopeless.
Speaker 2: Not at all. The defenses we discussed, robust sessions, strong sandboxes, clever model protection techniques like FPETs, FHE and prompt tuning, they show viable paths forward. We can build more secure agents.
Speaker 1: It really hammers home that security can't just be bolted on later. It needs to be woven into the design from day one, right alongside performance and features.
Speaker 2: Couldn't agree more. The paper's conclusion says it well: we need agents that are not just powerful but secure and trustworthy. That has to be the goal. As this tech keeps advancing, we need to look past the wow factor and really scrutinize the risks.
Speaker 1: This has been incredibly insightful and, yeah, a little bit sobering too. There's clearly so much more to AI agents than just what they can do for us. Which brings me to a final thought for you, our listener. As these agents become even more woven into the fabric of our lives, what are the wider ethical questions, the societal impacts, of the security vulnerabilities we've just talked about? And maybe, what's our shared responsibility, developers, users, all of us, in ensuring these powerful tools are built and used safely and reliably? Definitely something to think about. I'd encourage you to check out the research paper itself if you want to go even deeper.