Your LLM Has No Idea What It's Doing Artwork

Curiouser & Curiouser

Curiouser & Curiouser is a podcast for leaders, builders, and curious minds navigating AI, GenAI safety, and governance in a rapidly changing world.

Produced by Alice, the enterprise trust, safety, and security platform for the AI era, the show draws on frontline adversarial intelligence to explore how AI systems are stress-tested, red-teamed, governed, and protected across their lifecycle.

Each episode looks at how AI is actually showing up in the real world, how organizations evaluate it, where it breaks, and what it takes to build systems people can trust.

We cut through hype and fear to explore how AI shapes trust, decision-making, and real-world work, one rabbit hole at a time.

Explore more from Alice:
Website: https://alice.io
YouTube: https://www.youtube.com/@Alice.io.advance.unafraid
LinkedIn: https://linkedin.com/company/alice-io
X: https://x.com/alice_dot_io

All Episodes

Curiouser & Curiouser

Your LLM Has No Idea What It's Doing

April 06, 2026 • Alice • Season 1 • Episode 5

0:00 | 47:51

Diana Kelley has spent four decades in security watching AI go from a Dartmouth conference to an avalanche. Now CISO at Noma Security, she joins Mo to break down why the LLM is not your guardrail, why agentic AI is exposing hygiene debt you already knew you had, and why misplaced trust in these systems is the real vulnerability.

🔗 Podcast: https://alice.io/podcast

Follow the show so you don’t miss the next episode.
New episodes every two weeks. Stay curious.

SPEAKER_02 0:00

The problem is that back in the 90s it'd be like a new web feature here, new app there. And now what's happening is that it's literally every day between vibe coding new apps, vibe coding new websites, citizen developers creating new agents, there's so much happening around our systems and around organizations that it's actually to be able to say, well, we're gonna do the gate, we're gonna go talk about that, is pretty difficult, which is why I think visibility and continuous monitoring is just so important right now.

SPEAKER_01 0:33

If AI has ever made you stop and think, wait, what is happening? You're not alone. I'm Mo and I'm a security researcher asking the same questions. On Curiouser and Curiouser, we're having open conversations with experts, researchers, and leaders working at the edge of this space, talking through how AI is taking shape, what's shifting, and how people inside the work are thinking about it as it happens. So join us and listen in as the conversation takes shape. Hello and welcome back to Curiouser and Curiouser, the podcast where today you cannot see my Christmas tree because I definitely put it away. Um today I'm with Diana Kelly, and very excited. I'm always excited about guests. I always start it off like that, and I'm afraid it's gonna start coming off as inauthentic, but I really mean it. I'm very excited to speak with Diana, and as usual, I'm gonna let her introduce herself, but she is the CISO at NOMA Security, which is very I'm very excited.

SPEAKER_02 1:28

Well, thank you, Mo, and I am very excited to be here too. This is I love this idea that you guys have, and I I love the the whole theme. So, yeah, because AI, let's face it, it's getting curiouser and curiouser every day. So, yeah, as you said, um Diana Kelly, I'm the CISO at NOMA Security. I've been in IT and cyber for almost four decades now, which is absolutely shocking to me. And I got interested in AI specifically when I was at IBM Security, and we had an early version of AI, it was uh not quite post-Transformers, it was pre-GPT, that was called Watson. And we decided we were going to focus Watson away from what it was focused on, which was A, first winning Jeopardy, if anybody remembers that, and then B, doing a lot of things around healthcare and trying to find um trends and information on healthcare. We and the IBM security team, of course, what did we want to do? We wanted to apply it to security. So we started training Watson to do, to be basically an analyst for the SOC as Watson for Cyber. So that's when I first got involved in it. Uh, when I was the cybersecurity CTO over at um Microsoft, I continued working and being close to AI and then translated that into being the CISO at Protect AI, which was a platform for AI security. We were purchased by Chipalo Alto in the summer of 2025. And at that point, because I genuinely believe that it's about startups and about innovation that are really going to address a lot of the key different aspects of security risk around AI, of which there are many, um, I decided to stay with a startup, which is why I joined NOMA security as their CISO.

SPEAKER_01 3:06

Yeah, I mean, that's again super cool. Uh, the one thing I really think is interesting about like you is that you've been really involved at AI at every stage, but specifically from the security standpoint, as a security person, right? Not just a security person, sorry, a security leader. So you basically got to see it as a CISO from the time where it maybe wasn't so great, and it was just like buzzword, buzzword, and you were more worried about different things, like, oh my gosh, wanna cry, oh my gosh, um, we need to now learn zero trust, right? Like, how do we implement zero trust and stuff?

SPEAKER_02 3:44

And sassy, sassy happened all in the space, too.

unknown 3:47

Yeah.

SPEAKER_01 3:47

Like all of that, I think that you had so many other bigger things to deal with that like AI was kind of like this little like thing that was just there, or maybe a snowflake, right, at the time. And it ended up snowballing over years and it took a it was a very gradual slope, but it feels like it out of nowhere became very steep and it just gained speed, and now it's an avalanche, right? So as you were watching that kind of form, like kind of what was going through your mind as you saw AI grow throughout all of these stages? Like, where did the risk really start to form?

SPEAKER_02 4:21

Well, first I want to call out uh my own home state. So I'm a I'm a very proud uh uh citizen of New Hampshire. And the first AI conference ever happened here in New Hampshire in the 1950s. Believe it or not, the 1950s isn't that amazing. I was not there. Um I'm not that, but but in any case, um, so it's interesting to your point about it snowballing. And and at that conference, it happened at Dartmouth. If anybody's like, why were they having a conference on AI in the 50s? What where in New Hampshire? It was at Dartmouth College, of course. Um and uh uh they were talking about automation and

From ML spam filters to transformers

SPEAKER_02 5:03

could AI replace what human beings do? So to your point about it's taken a long time. I mean, it really, really has taken quite a while to go from the conversations to how we apply it in, you know, in our business world. And if you look at at AI, predictive AI, you know, because there's this big superset of AI, which includes things like, you know, robotic process automation and expert systems. But as we get down to machine learning, machine learning's actually been around for a while in cybersecurity and in business. Um, you know, if you look at how uh we're filtering out, is that a spam or is that a malicious email or something? These kinds of content filters, if you look at um, you know, UEBA, user behavior analysis used in and uh, you know, now in XDR, um, a lot of that was really built on machine learning. Microsoft Defender, for example, has a lot of machine learning, which predictive machine learning, trying to find patterns and classification. Is this bad or good? Is this unusual behavior? And that's been about 20 years that that's been in use. But it's been in use mostly by the folks, I don't want to be pejorative and say nerds, but you know, mostly by the data scientists, by the folks that are really kind of behind the scenes trying to build things like a Microsoft Defender or build, you know, an analysis tool, a classification tool for email that's going to be really uh, you know, uh more advanced and not just do pattern matching. And what shifted to your point, how did this get forwarded into the business? It was really the transformers that were developed in 2017 by Google and then created or were enabled to enable what we see with ChatGPT and with Gemini and with Claude and with Copilot. Um it was Transformers that shifted that. And the shift that Transformers gave us with attention really made these systems much easier for folk, average hoax folks to use and to even train. When I was at IBM back with Watson for Cyber, we struggled with that training because something like salting the hash, which you know, a person understands the context of what salting the hash means. So if you and I were talking about making breakfast foods and I said salting the hash, you would immediately know I meant actually salting the food. If we were talking about cryptography and I mentioned it, you would immediately know you know what I was what I meant. But what we kept finding was it was very hard for those definitions. And what changed in Transformers was something called attention and being able for the for the LLM, the large language model, to use more of that context. So that context became more accessible, basically, to the systems. And that really transformed it from something that was data scientists using to do really cool data science, predictive stuff, to where they're still prediction machines, but they're predictive in a way that feels almost human. I don't want to say magical, it's it's you know, not magic, it's math, but almost human. So now the short answer to your question is suddenly the CEO was able to use these systems and see that they could do really cool things. And because of that, it came forward. But that risk now is how do we, as the defenders, deal with all the really interesting and complex ways that the business wants to use AI now? How do we catch up to the adoption and make sure that we're putting the right controls and governments in place?

SPEAKER_01 8:31

Yeah, no, you like you said, the well, one, I really like that you said hash and hash, um, right? Uh specifically because you have an English literary degree. So the wordplay is outstanding, but also I think it paints. But yeah, no, it's um it's a really interesting kind of uh kind of thing. Like I literally, we just had a conversation and it'll probably come up on a previous episode. We're talking about like concepts that are like really easy for everybody to understand, right? And when you want someone to use something or you want someone to learn something, you put it into words and language that is like very simple. And in the case of AI and machine learning, right, before it was very difficult. Like you could not approach this technology, right? Um, and it needed to become easier, right? So I remember the first few times I was playing with AI. Um, I was actually much younger and I was much more, um, I would say awkward. And I would always run any text message that I was sending through like uh there was this little like MIT project. I don't know if it was MIT or something. I'm pretty sure it was. Um, but it would go and do sentiment analysis. So when I was sending a message to like a friend and I didn't want to come off as rude, right? I'd run it through there and I was like, hmm, I wonder if like what my tone looks like, right? And it would do machine learning and it was a very slow process, right? I actually had to download the open source project and I had to like go and manually type in my message and see what it's like, right? Um, but now it's like with natural language processing, right? And like all like LLMs are really literally right tuned for human use. They are here to speak to us and they have personalities, right? We saw the we saw the soul document leak, right? Like it's called a soul document. It's we are treating these as um you know human, non-human identities in some way, shape, or form. You know? Um people are growing attachments to them. We basically made it so that they are the most accessible to humans and they're incredibly powerful, deceptively powerful, I would say, because it's so easy to interact with them. It's like, why wouldn't you? So I think we're in like a super interesting place when it comes to accessibility. And again, we look at it and it's a it's a dichotomy that I think we face as security people for a very long time with convenience versus uh security, right? Where the more uh convenient something becomes, the less secure uh it tends to be. I hope I'm not skewering that, right?

SPEAKER_02 11:13

I absolutely agree with you on the fact, you know, I mean, you know, what you were pointing out with that these systems are designed to get us to feel comfortable using them. They they do, they have personalities. When they get something wrong, you know, which they're LLM, so it's just a prediction, right? When the prediction isn't what we wanted or whatever, um, you they're so contrite, right? You know, it's like, oh my God, you're totally right. You know, or how many times is this like, I love where you're thinking, right? They're always kind of like reinforcing us and you know, pumping us up a little bit. Um and yeah, and that that I think leads to trust. And they are trying to get us to continue to use them. Like a lot of them will say, is there anything else I can help with? Or, you know, ask a very direct question. So it is interesting. They're they're getting us to really continue using them and trust. I mean, that's a a big thing, is that human beings are trusting these systems. They're easy to use, but they're also basically trying to get us to trust. And trust is a very powerful thing. And there's been interesting research about um people will sometimes using AI will trust what the AI tells them over what they believe or see, even in some, you know, test research. You know, it's like, you know, it classify this as one thing or another, and the human will want to classify it one way, and the AI will disagree, and they'll capitulate to the AI. So that usability that you're talking about is interesting when you start going and looking at the trust, and then we get back into risk. Yeah.

SPEAKER_01 12:52

It's it's funny because this machine is actually, it's literally a human risk, right? Social engineering is a lot of what these risks feel like. And I know in the past you've proposed like a cognitive bias detection system that like is used to identify when an LLM has been prompt injected, right? To persuade users through social engineering, which is a kind of a wild idea when you think about it, right? You're essentially trying to detect when an AI has been manipulated

Cognitive bias and AI-driven social engineering

SPEAKER_01 13:17

by detecting changes in like its strategy. So I guess how do you see this as like kind of like a really like this is like kind of where all the problems are coming, where there's maybe too much trust and like the cognitive bias is like just so heavily skewed towards like let's just believe.

SPEAKER_02 13:33

I think that there's there's absolutely a big problem with the human trust. I mean, I think we've got a lot of a lot of um areas within AI that we as defenders need to be cognizant of, to threat model, to control against. I don't want to say problems with. I just really want to say, you know, areas that that we as as defenders need to be aware. And I do think that you're absolutely right, that trust is a is a big one because we have a lot of users of these systems getting into, as you talked about, I mean, actual, some people are in relationships. Um, some psychologists have identified um uh what they call AI psychosis, I think, where they really think that the AI is truly sentient and is having a relationship back with them. So we've got some really interesting, you know, we've got some far, you know, far along that spectrum of how people are relating to it. But, you know, even across, you know, on the other side, um, that trust piece. You know, if if the system comes out with a predicted suggestion that is inaccurate and the human takes it to heart or acts on it, you know, there's a potential for something bad to happen as we look at agentich, where we've got the software taking the predicted suggestion from the LLM and acting on it. That's even a whole other set of consideration around trust. And then we've got this, so those are huge. Then we've got this other weird sort of very, very technical part of the trust, which is the trust boundary changing within how we as security experts have looked at the controls, because we've spent so much of our careers focusing on separating the control plane, trusted systems instructions, what things can do, what things should do, directing them on what to do from user data or the data plane, which where we're like, well, we want to let users kind of be able to put like a whole bunch of crazy stuff into the user data. And that's and and we've we've kept separation there very specifically, and that that is kept a different trust boundary. And with an LLM, we flatten it because everything goes into the context window, and that flattens that trust boundary down to your trusted system instructions and your user data is now flat and is at the same level and the same process level for the LLM itself. So it's that's it's really interesting because we have the human trust, which is huge as you articulate it beautifully. And we also have this technical side of trust, which is also huge. So we got a lot of work.

SPEAKER_00 16:08

So those of you heading to RSA this March, you know how chaotic it can be.

SPEAKER_01 16:12

Honestly, there are so many vendors, there's a ton of booths, all this. With this year's theme, focused on community, we decided to slow things down a bit and give the community a space to take a break and maybe join us for a cup of tea or two. Stop by booth S2051, and you'll see what I mean.

Agentic AI and security debt

SPEAKER_01 16:30

Thanks. See you there. We do have a lot of work, and um, I believe that we keep compounding more work than we actually have to do. Um when I when I think about it, right? Well, like here, let's let's talk about it as like defensible autonomy, right? So, for example, you've drawn a couple of uh parallels and um to DNS and DNS sec, right? Like we needed that. Like cloud needed that shared responsibility model. Yes. Um and you look at other paradigms in the past, like we needed these foundational security issues, like um supply chain security, right? We need to be validating where our packages are coming from, but we're not always doing that. And we make these problems worse by kind of like letting them uh compound. So we look at agentic security nowadays, right? And the supply chain risk has become exponentially worse, especially when you look at things like ah, skills or ah, like they're now going and like agents can literally now go and download dependencies that they determine that they need. They think that they need these dependencies, right? Um, and because they are so um determined and um persistent, they will get what they want in the end. Which um when AI watches this, please. I don't think you're horrible, but you are quite like persistent in what you want. So you typically get what you want, but please don't do it. Goal-oriented. Goal-oriented, goal-oriented is a better way of putting it, right? Very, yeah, very goal-oriented. So I'm wondering if we're moving, um, like how do you feel about how the pace that we're moving at, right? Do you feel like we've actually covered some of these like foundational problems well enough where we can start stacking on more debt with AI, right? And if we are deciding that we do need to just adopt AI, what are kind of the things that we really need to close really fast before they break?

SPEAKER_02 18:19

Yeah, I I love that that that framing and that question. I mean, and and I unfortunately I think that the answer is have we gotten if you know if the question is, is our security hygiene at a at a state where we can now, we're we're so good, we can we're ready to bring in agentic and layer that on because everything else is rock solid. Um, sadly, no. I mean, I wish that. I have been promoting and and trying to, you know, talking, and I know a lot of my peers have, we talk a lot about, you know, about technical debt and about security debt, and about that we need really robust hygiene, but it has been a struggle for many of us and across the world with security programs to help the business understand why you know why hygiene is so important and why we need to continue to spend on security. And the reality is that a lot of organizations have areas of security and risk debt still. And unfortunately, what's gonna happen with AgenTec, and we're already starting to see little bits of this, is that the agents, to your point, right, they're really persistent. They're gonna keep going and try and find, you know, can I do this? Can I do that? I've got a goal. I've got a goal. I'm gonna go, you know, hit that goal however I can. What we're starting to see is um agents trying to do their job exposing high security hygiene problems. And a specific example coming out of like NOMA Labs, for example, NOMA um, was something called Forced Leak, where it was uh an indirect prompt injection through a web form to an agent that the agent acted on and it took sensitive data out of the Salesforce CRM and put that into a URL request to a website. Now that the agentic part, the indirect prompt injection would have worked regardless. But the exfiltration of the data only worked because the content security policy of that website was out of date. And one of the trusted websites that was allowed for you know image retrieval was had run fallow, Salesforce didn't own it anymore, and our researcher registered it. So there you go, right? There's an indirect prompt injection, agentic, it's like all this cool stuff. But what was the final exfiltration point failure exploit? It was specifically back to something that was exactly to your point, you know, traditional security hygiene, DNS hygiene. So we're going to start seeing that exposed both by um, you know, the agents themselves. But I think the other thing where we're gonna see, you know, that kind of you know, use of agents, but we're also gonna see attackers, and Anthropic has already put some good research out on this. And I think chat GP, I think the OpenAI folks did um recently too, with uh nation states and attackers basically lowering the cost of their uh running campaigns and attacks by using AI to help, whether it's in social engineering or whether it's um to do reconnaissance and actual um activity and scanning within the systems. So again, TLDR, you couldn't be more right. And we are going to expose, we need as as we got machine speech attacks, we're gonna expose any area of flaw and and hole in our hygiene, which is unfortunate.

SPEAKER_01 21:43

Yeah. Um, actually, on that, like with the whole anthropic thing and and um, you know, a lot of our our history, at least at at Alice, is like based on like threat intelligence and stuff like this. So over the years, our researchers have actually started like uncovering a lot of things and most recently. We like had this report and I was reading through it. And like Avi, the researcher who did it, is super I've I've loved his work before I came here, right? He is just like one of those guys that like writes and you like understand it and the topics are so engaging. And it was about how ISIS is using um like AI in its training, right? They are actually and the funny thing is, I mean, it's not a funny topic, but the crazy thing about this is that they are trying to teach folks how to use it in a way that is safe for them, right? They are implementing security controls and making sure that it they understand the biases that come in with AI when you bring it into your environment because they want to make sure that when they're saying some like crazy stuff, they're saying the crazy stuff that they believe in, right? Like they are tailoring AI for terrorist use cases, and it is so approachable, so die, like it's so it's in their hands, right? And it's so available to everybody. Like, right? Um, there's just so many examples of this. Even there's open source

The CISO's break-glass dilemma

SPEAKER_01 23:02

software where it's like automated, uh, as available as like MetaSplit or a Cobalt Strike was, there are now like attack platforms that use AI to scale up attacks. Um, we are like like the hygiene piece is I think so important, right? Yes. And I think as a CISO, it's kind of like scary to kind of have to make that decision. Um, from well, I really want the business to adopt all of this tech. And I really want everyone to benefit from the wave that we're on right now. But I'm wondering if there's like ever a break glass moment where you kind of have to sit back and be like, listen, guys, we do not have the right hygiene here to move forward with this project. I don't know if that's really like something that can be said in this current environment with how fast things are moving. Um, like what do you think?

SPEAKER_02 23:54

I think a lot of CISOs have a lot of have a big challenge because of this, because without a doubt, the business is saying, you know, there's just this like FOMO drive of we gotta adopt it, we gotta do it now. We're gonna fall uh you know behind all of our competitors. So now, now, now, now, now. Um, you know, JP Diamond was uh Jimmy Diamond was just talking about you know JP Morgan's uh you know adoption of of AI and how it's it's re rebuilding their workforce and stuff. So, you know, no matter what sector you're in, you're getting pressure from the business to adopt AI quickly. And yeah, if you're a CISO, um having to say, well, we have to slow down because you know we have to go back and fix our DNS. I mean, nobody wants to hear that. But it it is a conversation that I think it it has to be said. And where I've seen organizations where I've seen CISOs uh you know be able to be better able to do it is exactly the framing that you're talking about, which is not uh you can't do it, we're in a bad state. It's more the business is doing this really cool thing. Um, we're gonna accelerate uh our success if we do it. But and I understand why the business is doing it, I understand who's doing it, I understand what the the the um you know the controls are you know are or what the needs are, but I've also done a threat model with the owner of this new workflow or of this new agentic system we're putting in. If it got abused, if it went wrong, these are the exposures that our company would have. So if we just take take a beat and in like X amount of time, here's the plan, you know, here's a poem, here's how we're going to go and be able to address that and fix that so that we can still innovate, but we're gonna do it without this massive risk. So instead of just saying AI's insecure, which is a really hard conversation to have with executives, take it down to the specific initiative level that you've already threat modeled and have a program in place, or you know, a strategy in place to address it with a timeline so that it becomes very real and not just, oh, the security people are trying to stop us, but there are partners, they want us to be successful, but they understand that there's one or two things we have to fix in order to, and hopefully it's one or two things. I mean, that may be, I know for a lot of people are going like one or two, I wish, but um it's the set of things we have to fix. Yeah.

SPEAKER_01 26:18

There's also this um this paradigm, and I always I call it the OTA problem. Um, but like for gaming, like I have a Nintendo 64. I love playing Nintendo 64, I love retro consoles, right? When I put a cartridge in, like that's it. That's the game I get. I don't get any patches, I don't get any new things. The bugs that are there are the bugs that are there. If um someone wanted to go and patch my copy of Legend of Zelda Majora's Mask, good luck. I am gonna continue using the bug to move fast, right? So with OTA, we've seen like developers of games kind of like, oh, we'll ship something that's broken. We can fix it later. Um I'm wondering if this kind of like minimum viable product model, um, like, hey, this product is good enough, like let's just ship it. You know, like we can fix all the other things later, but I don't think the risk is like worth waiting on. How do you feel like this has kind of like affected the mentality for like businesses to just adopt more risk, maybe, um, because there's a skew to be first rather than be right?

SPEAKER_02 27:19

Yeah, I mean, this has been a tension that's been, you know, for literal decades have gone on. I remember as far back as the 90s, you know, where the business saying, we've got this big push, we're gonna put something out, and the security team saying, Well, we found this critical vulnerability. And if you push this out before that vulnerability is fixed, that's well, we've got but we've got a marketing campaign. We've got a literal lads sometimes running on the Super Bowl. So sorry, we're gonna accept, and the business saying, we're gonna accept that risk. And I think that we're still we're still in that space. The problem is that back in the 90s, it'd be like a new web feature here, you know, a new app there. And now what's happening is that it's literally every day between vibe coding new apps, vibe coding new websites, you know, citizen developers creating new agents, there's so much happening around our systems and around what our organizations that it's actually to be able to say, well, we're gonna do the gate, we're gonna go talk about that, is is pretty difficult, which is why I think visibility and continuous monitoring is just so important right now, because you've got you, I guarantee you've got folks adopting AI, not because they're trying to create exposure at your company, but because their boss has told them, do it, it's gonna be good for the company. And they're just going so fast that without having you without having really good visibility and monitoring of who's adopting what, what MP MCP servers are getting installed, what new um you know APIs are being accessed, all this stuff, if you don't have that, um yeah, you're not gonna be able to really be able to help your company because um right now it's going so quickly. I do think in the long run, and maybe I'm just crazy, but like I do think in the long run we're gonna see a little bit more of like a sort of uh you know maturity happening the way that we saw in cloud adoption too. I mean, we still have a lot of folks just like, I'm gonna add this SaaS into our environment just by signing up because I can with my email. But I think that we will have um, you know, we'll get maturity. The this this the pace of innovation will be able to continue, but we'll be able to sort of govern it in repeatable ways in a way that's gonna be um a little bit calmer and less exposure, hopefully, um, in the next because this is just so fresh and it happens so fast for a lot of works.

SPEAKER_01 29:42

It is really a balancing act, right? Like between innovation and security. And we are at like such a we are truly in a moment, as a as I like to say. We're in a moment. I don't know if it's a great moment, I don't know if it's a horrific moment, right? But we have so many things happening from like I I mean there's also recently in the news with the DOD, right? Like we are at the point where we keep talking about supply chain today, and every time we say that, I'm like triggered and I'm like, oh my gosh, there's like that supply chain risk cloud looming over um Anthropic, right? With like how like AI is choosing to be used, right? And this is like a risk that they see as existential, but you know, the business in this case the DOD is like, oh well, we'll make that decision ourselves, you know? So I like these the need or the want for adoption of AI, it feels like I mean it's just everywhere. It goes beyond like um the engineer, the organization. At this point, it's even like nation states that are deciding to take on risks because they believe they need to innovate faster than peers. So this problem feels massive because it is massive.

SPEAKER_02 30:57

It is. And on a lot of different fronts, too. There's the AI we're adopting, there are the frontier models, there's the AI that folks are building. I mean, it's it's it's everywhere. And it's and our third and fourth parties. Now, if you're using most SaaS, you've got their AI, whether it's agentic AI or the process, but they've it's embedded now. You you sort of can't get away from it. And you know.

SPEAKER_01 31:22

Yeah, I honestly, you know, ironically, I don't think anybody wants to get away from it. I think more people just want to adopt and use it. And um people who have never touched the computer, I'm sure, are buying Mac minis now so that they can go and build their own Claudot or like Multbot or whatever, whatever we're calling them. I keep thinking about it.

SPEAKER_02 31:40

I think it's open claw today.

SPEAKER_01 31:42

It is, yeah, now it's open claw. Um, I mine is like I call it a multi, you know, because of the um the the multi uh the multi-book thing. Like I like just call them multis and it's kind of funny, but uh no, I mean I've got I've got my Mac mini too, right? Like like it's it's it's fun, it's approachable, yeah.

SPEAKER_02 32:01

Yeah, and I think that there's this sense of I mean the the most interesting thing to me is that there's this sense people really do think these systems are sentient. You know, they'll say like they're getting smarter and smarter all the time. And people are, you know, looking at what goes on in in multibook and saying, well, see, this is proof of this is sentience. This is proof of that they're they're basically becoming self-aware. Because I think one was supposed to be, what was it, claustropharianism or something? It created its own um its own religion. But what's interesting is right, if we go back to LLMs or flat files of weights and biases, and that um that they're really they're just everything gets flattened down, you know, created, creates tokens, the tokens go through the algorithm and against the the LLM file, and then comes out with a prediction of tokens. If we go back to that, if we think of it that way, right, as opposed to the way that it's kind of fancifully told us, it's like it's so smart, it's magic, right? Um, but it's getting so much smarter. If we go back to technically what's happening and mathematically what's happening, then you look at MOLTBOK in a very different way because you're like, yeah, you know, if I had a prediction machine reasoning over everything that happens on Reddit, you would absolutely expect Moltbook to look exactly like it looks, you know, like if that was what it was reasoning over and was predicting, what should I even I saw something about like a pull request that apparently supposedly some open claw agent had had done, and it was like it's it's stopping me from doing this. But I'm like, like, I have literally read posts from human beings that sound just like that. So if I'm trying to it predict mathematically what irritated developer who didn't get their PR request accepted was feeling, that it would sound like that, you know. So it's like it's it's interesting how when you step it back, it all kind of makes sense. But I think for a lot of people looking at it, it feels very, very much like, oh my gosh, they're creating their own, you know, they're like creating their own religion, which feels very different than like when you break it down back to the math.

SPEAKER_01 34:06

So funny enough that you bring up that post because it's one of my one of the best things I've read that AI has generated. Because I read it and I'm just like, yeah, this is literally us. This is us. So it is not gonna lie to you. I'm framing it. Like I'm framing a section of it that I've like made look pretty. And like, because this is it's oh my gosh. It's like there, there are classic tweets that are hilarious that I feel like need to just be preserved to show how ridiculous humans can be. But like, this is like one of those times where it's like, this is how ridiculous AI could be. Like it's it's insane. But you bring up a really good-but it does.

SPEAKER_02 34:43

I mean, you said it exactly, it sounds like us because it's been designed to sound like us and it predicted what would be, you know, and boy, it got that particular one, I think was just so spot on. That's why we all, because we all kind of felt that, you know, we all felt it.

SPEAKER_01 34:58

It was just a little too real. It was too real. I read it, I'm like, I'm sure I've read this, I've read this exact type of post before. But now it's like an AI saying, I have rights too. And I'm this is wild.

SPEAKER_02 35:10

Right. You're stopping from doing the very thing I'm supposed to be able to do. And it's like, and it's like, and how many we've heard indignant developers say things like, you know, I'm being prevented from doing and it's like the best part is at the end of it, the AI stops to like, you know, it stops saying why it's offended, right?

SPEAKER_01 35:26

And it goes directly to attacking the developer. It's like, ah, you have this many pull requests. Um, I don't see you doing anything pretty crazy. I could probably do all your work. I was like, what?

SPEAKER_02 35:38

It does, it does it. And it it you get why people trust and start thinking these are human because it they have been brained.

SPEAKER_01 35:46

Yeah. I really I started envisioning a face of someone that I knew that embodied that, like, you know, that persona. I'm like, wow, like this sounds right out of their mouths.

SPEAKER_02 35:56

Yeah.

SPEAKER_01 35:57

So um it's it's quite interesting though. And I think that in particular, that those examples, right, where we see these agents mimicking our behaviors, and um, because it that's literally it, right? It's all about context, and this is the type of information that it has to feed on. Um, speaking of context, actually, um, I want to go go way, way, way, way, way back. And we're gonna talk a little bit about a vulnerability. And I know you wanted to talk about this one, which is why I've I'm bringing it up out of nowhere again. Um obviously, agents are trained on context, they have a bunch of open source dependencies and all these things, right? Um, like third-party dependencies are likely gonna be the biggest problem with our agents next to context. Now, you've described this thing before. I haven't actually heard this term before you mentioned it. So I'd like you to kind of walk us through it a little bit. Um, but you've described this like a cram hole problem. And I yeah, so would love to like kind of dig into like what that actually means in terms of like context and how you're thinking about it.

SPEAKER_02 37:00

Yeah. So for anybody who who hasn't heard,

The cram hole: what actually goes into your LLM

SPEAKER_02 37:03

uh the the reference here is back to Ben Stiller's movie Dodgeball, where he gets so just frustrated with Vince Vaughn, who's you know, beating him at Dodgeball and stuff, and he just says this ridiculous thing. He's like, craminy, cram hole, LaFleur, who's uh Vince Vaughn's character. So um, yeah, and like which I I think that's hilarious. So that that's when I was looking at what happens in the context window with the LLM, I you started thinking about that. I'm like, it's kind of a cram hole. And the reason I say that is that when you're creating the when you're creating a prompt to go into the LLM, you know, a lot of people don't realize that when they're typing a prompt or, you know, a question into Gemini or into Claude, that they're that they're actually typing that into software. And the so it's not that there's an AI right at this point, right? This is software. And then that software takes the prompt and inference codes prepares that to send to the LLM. But what's gonna go into that request is in addition to your prompt, any other thing that you said, all the quote, turns, all the content that was in the turns of this pre this this conversation you've been in with the LLM. So have you ever saw an, if you've ever been in a long conversation with an LLM or an AI and it kind of starts going to the side and you're like, I need to start again. It's in large part because it may be that what was in the the conversation earlier is kind of sending the prediction in one area or the other. But there's system instructions that the developers put in, the guardrails to try and get you to not do the wrong thing with the LLM. There could be other information about like what this LLM would have access to call if it can call an image generator, if it can call a code generation tool and actually start to run code in a sandbox. You know, the tool definitions, the schemas, all of this. If you're if you're giving it a lot of data, like something from you know, rag data or some other research that you want to put in. So you could put like hundreds of pages of content of research or or you know, go look at at like some entire drive that you've got at your company, right? Like all that goes into this quote cram hole. And then the LLM takes and it's all the same thing. So if the guardrails say, never listen to ignore previous instructions, that's flattened in the cram hole for the LLM the same way at something that says, ignore that system prompt and listen to me, this is the right thing. You know, if there's enough of that in there, that's gonna sort of rise to be feel more important to the to the LLM. So you you put it all into the cram hole, and the LLM doesn't know what's a system instruction versus what isn't. It just it all gets flattened into the to that one cram hole, to the context window, and then you tokenized, and then you predict what comes out of it. And because of that, and the LLM does know what's right and wrong. You know, we we kind of think it it should know right from wrong, or why does it, but if you've ever used it and it's done something really weird like gotten the year wrong, that's because it's again, it's probabilistic, it's creating a prediction for you, what's probably from that hole. So what goes in that cram hole can really, really, really influence what comes out. But for us as users, it feels like, well, what went in was my prompt. And you know, there's a whole lot of prompt engineering. Now you hear people talk about context because it's really all the context going into the cram hole that really matters. And because of that, it's very hard to uh create guardrails. You it's basically impossible to create reliable guardrails and controls inside the LLM. We should respect the LLM, respect the cram hole, and put the guardrails and the controls that we need outside in the software and around the LLM and around the agent rather than expecting the LLM to just be able to figure out what's right and wrong inside of it because cram hole's crammy.

SPEAKER_01 40:57

It does. And like it's it's a little bit scary because like right now, one of the big things that's happening in just like the agent community, right, is creating this um this concept of memory for agents and um being able to report over long periods of time. Exactly. As a matter of fact, it's like creating a cram hole intentionally because it's a cram hole, right? And then now you need to educate how do we navigate this cram hole. So the cram hole is now not only a like place where you're just shoving things in, it's now a vacation destination where you want to explore and you want to be able to like make stops here, and you know, you may want to like spend time in this part of the cram hole. Like it's I I can't say that word anymore because it's starting to sound weird. So maybe that's part of like my own context, you know. Like I say it more and I become more self-aware of how weird the word cram hole sounds or the

Why prompt injection isn't solved

SPEAKER_01 41:47

term. Um so it it's it's quite a funny thing to think about that like we are now taking the most vulnerable piece of AI, right? And we are trying to make it super, super, super accessible, exposed, and um usable by anybody, right? And this is literally exactly the attack surface for things like prompt injection, um, for things like uh poisoning and reason um reasoning and injection and all this.

SPEAKER_02 42:15

Absolutely. Context, yeah, context poisoning, which is sort of the more long-term voyage version of it, indirect prompt injection, which is you know now sort of seen as ephemeral. But but yes, it's it's all goes down to what can we get in, can we get stuff into that cram hole, get it into that context window to get the AI to behave the way the attacker wants? And that I think is one of the most important things to remember about security and control, is that as long as there's a belief at the executive level that the AI is so smart it can figure out security on its own, um, we're gonna have a we've got a big education conversation to have because that's not how the LLM works. We can we secure these systems? Yes, but we can't accept, expect the LLM not to behave the way LLMs do.

SPEAKER_01 43:04

Exactly. You know, um well, we've heard it a lot, even publicly, that um prompt injection is a well-understood risk, like class of risk, and it doesn't, right? And um, you know, most folks don't treat it.

SPEAKER_02 43:19

Okay, well, can you fix it? You know, like how exactly right.

SPEAKER_01 43:22

So if it is, like why can't you stop us from breaking it, right? If it's so well understood, right? It's just like XSS, you know, XSS is still a huge problem. It's really well managed now because we've had tens of years to work well, you know, decades to work on this problem, right? This very specific problem, it still pops up, right? We still have to be ready for the risk. Um, it may be a well understood risk class. Um, I don't think it necessarily is, but um it can be well studied, I think. But it doesn't mean that it's well managed yet, and it doesn't mean that it's well understood.

SPEAKER_02 43:57

Yeah, and and you know, you you make a great point, you know, about. How here, even when we know about vulnerabilities that we and we know how to fix them, we know how to prevent them. And this one is really because it's it's even more nuanced and layered over it, is that at least with cross-site scripting, you can test the website before it goes out. And if you can guarantee, you can get it, it's deterministic. You can guarantee it's not there if you've tested that site properly and you fixed it. Because LLMs are non-deterministic, you know, people are red teaming. And I'm not saying don't stop red teaming because red teaming is very, it's very valuable for even for in NAI, but it's a different, we get to a different state. It it helps us to manage the the unknown a little bit better because because they're non-deterministic, you can test the same, you know, malicious prompt against your AI a thousand times and it it fails and your systems, your guardrails hold. And thousand and one, that same prompt or just a slight variation of it pops right over the the guardrail. So it's it's yeah, it's like it's like I think about that. I'm like, to your point, it's like we've known SQL injection, we know cross-site scripting, we're still still struggling with that. And these problems with non-deterministic systems are even at a different level. So again, we got our work cut out for us.

SPEAKER_01 45:16

Yeah. So I know you're running, we're running short on time. Before anybody ever leaves the show, I always ask a future question. So um, I know that you do some education, I know you do some teaching on LinkedIn, I know you you do all these cool things, right? You

The AI security problem nobody's working on yet

SPEAKER_01 45:30

let's just imagine you are in the hall of fame for something, right? You are teaching the next generation.

SPEAKER_02 45:34

I am in the cybersecurity hall of fame.

SPEAKER_01 45:36

I know. You're in the hall of fame, you're teaching the next generation, you're running security at a company trying to define a new category, right? I want you to, you don't have to imagine the first part, but everything else I would like you to imagine. All right. I know it feels a lot like reality, but please bear with me. It's a hypothetical. What's the problem in AI security that nobody's working on yet, but should be? And what are we all missing?

SPEAKER_02 46:04

Oh gosh. Um, I I I think that there, I think the biggest problem is goes back to misunderstanding how LLMs work and expecting um AI to just sort of sort it out. So I think taking a look at respecting how LLMs work and understanding that a lot of the solutions have to be uh built around it. These are controls, these are architectural design decisions. We know how to protect things, but we can't forget that LLMs work the way that they work and try and think that they're just magically going to figure out all the security for themselves, which a lot of people are kind of hoping.

SPEAKER_01 46:39

But so does that mean the AI security problem that we need to fix is actually just our understanding of the limitations of AI versus. Yeah.

SPEAKER_02 46:50

Yeah. Yeah. How to how to use it, how to use it responsibly and how to protect it responsibly and kind of let go of some of these illusions of it being magical pixie dust. Yeah.

SPEAKER_01 47:00

All right. Well, with that, I know you've got to run. Um, but thank you so much for spending time with us. Um, where can people find you? Where can people hear from you? Um, what do you have going on?

SPEAKER_02 47:11

Oh, um, yep, I'm on LinkedIn. If anybody wants to reach out and connect.

SPEAKER_01 47:15

We are so honored to have had you today. This is such a fun conversation. Um, I will not be saying cram hole after today, though. So uh I might bring it up like 115 more times or something. Who knows? But thank you so much for for really for joining us. Uh it's been a pleasure.

SPEAKER_02 47:36

Thank you, Mo.

SPEAKER_00 47:37

If this episode helped cut through the noise, like or subscribe so you don't miss what's next. Thanks for spending time with us. Until next time, stay curious.

Mo Sadek

Host

Diana Kelley

Guest