Why Production AI Exposes Security Gaps Organizations Can't Ignore Artwork

AI Proving Ground Podcast: Exploring Artificial Intelligence & Enterprise AI with World Wide Technology

AI deployment and adoption is complex — this podcast makes it actionable. Join top experts, IT leaders and innovators as we explore AI’s toughest challenges, uncover real-world case studies, and reveal practical insights that drive AI ROI. From strategy to execution, we break down what works (and what doesn’t) in enterprise AI. New episodes every week.

All Episodes

AI Proving Ground Podcast: Exploring Artificial Intelligence & Enterprise AI with World Wide Technology

Why Production AI Exposes Security Gaps Organizations Can't Ignore

February 11, 2026 • World Wide Technology: Artificial Intelligence Experts • Season 1 • Episode 64

0:00 | 46:44

As enterprises move AI from experimentation to production, security failures are no longer isolated incidents—they are systemic risks embedded deep in infrastructure, data flows, and decision-making systems. In this episode of the AI Proving Ground Podcast, WWT's Istvan Berko, Cisco's DJ Sampath and NVIDIA's Ofir Arkin discuss why as AI becomes core infrastructure, security becomes the mechanism that determines whether that infrastructure scales or undermines itself.

Support for this episode provided by: Rubrik

More about this week's guests:

Istvan Berko is the Global Head of AI Cyber and Innovation at World Wide Technology. With 25+ years in security, risk, and governance, he has held senior roles at NTT/Dimension Data and AWS, authored AWS Cloud Adoption Framework whitepapers, and remains active in the cybersecurity community through leadership and industry events.

DJ Sampath is Senior Vice President of Cisco's AI Software and Platform group, where he leads the charge in shaping a unified AI vision across the company's product portfolio. A builder at heart and a visionary by design, DJ blends deep technical expertise with strategic storytelling to drive transformative outcomes at scale. He has founded, grown, and sold category-defining startups, advised U.S. government agencies, and emerged as a thought leader at the intersection of enterprise AI and innovation.

Ofir Arkin is a renowned information security expert with a career spanning academia, consulting, and executive roles. He's passionate about creating innovative products that address customer needs, and has introduced several industry-first technologies. Known for his dedication to mentoring, Ofir has authored numerous influential research papers and articles, and is a recognized speaker in the information security community.

The AI Proving Ground Podcast leverages the deep AI technical and business expertise from within World Wide Technology's one-of-a-kind AI Proving Ground, which provides unrivaled access to the world's leading AI technologies. This unique lab environment accelerates your ability to learn about, test, train and implement AI solutions.

Learn more about WWT's AI Proving Ground.

The AI Proving Ground is a composable lab environment that features the latest high-performance infrastructure and reference architectures from the world's leading AI companies, such as NVIDIA, Cisco, Dell, F5, AMD, Intel and others.

Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments.

Why AI Security Suddenly Matters

SPEAKER_03 0:00

From Worldwide Technology, this is the AI Proving Ground Podcast. As organizations mature their AI posture, it's introducing behaviors that can drift, agents that act on behalf of other agents, decisions that don't fail loudly when something goes wrong. In other words, small compromises can quietly cascade into lost accuracy, wasted compute, and broken trust. That's why security has become central to the conversation around AI factories, not as a bolt-on, but as part of the infrastructure that allows intelligence to be produced safely, efficiently, and at scale. Cisco's secure AI factory with NVIDIA is grounded in that reality. It's about securing non-deterministic systems, protecting the full life cycle of models and data, and doing it without stealing performance from the very GPUs driving that value. So on today's episode, we're talking with three experts who see this problem from very different, very practical angles. Iastvan Burko leads AI cyber and innovation here at WWT. He spends his time helping large enterprises move AI out of the lab and into production and dealing with what breaks when security wasn't designed in from the start. DJ Stampith is Cisco's senior vice president of AI Software and Platforms. He's focused on what it actually takes to secure AI systems end-to-end at the scale enterprises are now demanding. And Ophir Arkin is a manager and senior distinguished architect for cybersecurity at NVIDIA. He works at the hardware and platform layer where performance, networking, and security collide, and where small design decisions can have outsized consequences for how AI factories behave. Together, they'll detail where the enterprises are exposed, why traditional approaches are falling short, and what security looks like when AI is no longer experimental. So let's jump in. To the three of you, thank you so much for joining here today. DJ, we have you out in Silicon Valley, Esteban in uh Dallas, Texas, and and O'Fir all the way out in Israel, uh, at least compared to WWTHQ here in St. Louis, Missouri. To the three of you, thank you for joining today. How are you? Great. Thanks.

SPEAKER_02 2:02

Thanks for having us. This is uh, you know, super excited to be here and uh and and excited to uh hang out with all of you.

SPEAKER_00 2:10

Yeah, looking forward to it.

When AI Stops Being Predictable

SPEAKER_03 2:11

Awesome. I I do want to level set a little bit here. So the Cisco's most recent AI readiness index had, I'm gonna read this stat, 29% of companies believe they're adequately equipped to defend against AI threats. And maybe even more alarmingly, 33% have a formal change management plan for guiding responsible adoption. So clearly there's a lot of room for growth in this securing AI area. DJ, I want to start with you. You know, Cisco recently also rolled out a new framework to identify and mitigate security threats and risks as it relates to AI. You know, that is giving leaders, you know, an idea of what they're missing from a security landscape. So, you know, can you dive in a little bit about where we think organizations and IT AI leaders are getting wrong as it relates to securing these AI systems?

Why Risk Models Break

Inside The AI Factory

SPEAKER_02 2:59

I think the the most important thing to sort of understand when you sort of think about these AI systems is that you're introducing non-determinism into the stack when you're introducing AI models, right? If you've used Chat GPT, you know, you all know this, right? When you ask the same question multiple times, each time you don't get the exact same response. You get slightly different responses based on what time of day you're asking that question. And that and the reason for that is you know, you're you're fundamentally introducing models into a typical stack, which used to look like you know, you had your your data, you had your you know, your your infrastructure, your data, your application, your users using it. But now what you've done is you've introduced models into that mix and and your agents and applications are using these models to be able to take certain types of actions, behave differently. And so it's it becomes really important to sort of understand that you're looking at a stack that's fundamentally different right now, right? And and so the very first thing you need to do is to make sure that you have the right taxonomy to be able to talk about the types of threats that you're seeing, the types of risks that you're introducing in your stack. And then that's really where we've doubled on and we've done some amazing amount of work. And what you just talked about, the framework that we launched, is is it's basically offering you a fundamentally different approach, right? For since one of our first holistic attempts to classify, to integrate, and to operationalize the full range of AI risks, you know, all the way from adversarial threats to content safety failures to model and supply chain compromise and agentic behaviors, and and more importantly, the ecosystem risks and organizational governance that you need to put in place to be able to think about this in a completely holistic basis. And and by the way, we did this in a completely vendor-agnostic way that that makes it, you know, for folks like you know, on the call, like you all, like WWT and Nvidia. I think I think this is really an exciting opportunity to be able to say, hey, this is the framework that you can align on to be able to get every single practitioner to start to speak about this, you know, in a in a in a more unified way so that it's not completely fragmented. And that's the risk, right? It's a new thing, it's fragmented, everybody's trying to find the right terminology. You gotta bring this all together so that people can start thinking about this more in a holistic fashion.

SPEAKER_03 5:09

Yeah, Ophir, build on top of what DJ said there, you know, bringing it all together. Why is that such an important aspect to all this? As it relates to securing these systems, which are are constantly on the change.

SPEAKER_01 5:22

I think that, you know, relating to what DJ said, uh, you know, not a lot of people understand that, you know, AI workloads runs on specialized uh data centers, right? We call them AI factories. It's not uh it's not the same as your regular data center. And AI factories really a purpose-built, accelerated computing infrastructure that is designed to actually create value from uh from data. And it does so by managing the the entire AI lifecycle, whether this is from ingesting the data to training, to fine-tuning, and to high-volume AI inferencing. And the primary product that we are discussing here is intelligence, and that intelligence is actually measured by token throughput. And that's the goal, right? To to generate as much as intelligence as possible, which drives uh our decisions or the AI decisions, actually, automations and of course other AI solutions. Now, AI factory the the focus is when on the operational side is throughput and efficiency. And the goal, as I said, is to maximize the production of intelligence. And that's why it's so much different. It is purpose-built, it's built for something else. The idea is to maximize on its value. And that's why it's also so much different than the regular data center that we all know very well. I would also add that it's much more complicated. Uh it's much more complicated to understand. And when you add, you know, six months ago or eight months ago, we would talk about reg, right? Single agent talking to different data sources. Take six months, and we are talking about a Gentic AI, which is, you know, n n dimensions uh of a problem that we had a single dimension. So you take that and you multiply that, and you add that, that it's we don't understand how eventually those decisions are being done by the AI models. And adding to what DJ said, right, it creates confusion, it creates misunderstanding. And we need to align people to to really understand where the attack vectors come from so they can defend themselves the right way.

Tiny Flaws. Massive Impact.

SPEAKER_03 7:48

Yeah, Istanban, um, you know, maybe to to close out on kind of this opening segment here with you, how much of a ripple effect is that having? I I read a stat that said even just a very tiny like you know, issue or you know, breach in a training model, something like 0.001% could have a 30% impact on like accuracy for these models. Is that just the state we're in now that like small security issues can balloon into much bigger security issues down the line? Is that is that an accurate read there?

SPEAKER_00 8:20

Absolutely. On the training side, right, a lot of these models leverage long-term training, right? So we're talking about spending huge amounts of resources within these data centers over months, right? Taking all that information, training it. If if some attacker can get into that and poison that data set and manipulate some of those training pipelines, then you're actually, it's not only that your your outcome has been essentially tainted, it's also that you've lost a huge amount of compute over time, right? And that has direct translation to the cost and the ROI that you're gonna get back from that infrastructure. And as of you know, as DJ was saying as well and uh uh fear, um with the genetic uh growth, and as you know, a year back this people were talking about it, but it wasn't really implemented in uh in practice very often, right? But the growth is essentially about the same if you look on a linear graph between uh the um the standard GPT adoption versus a gentec adoption, but you've got 10 times, 20 times the uh potential attack surface because now it's not just one model, it's models talking to other models. So you've got embedded models behind one another. You still need visibility of that because, as DJ also said, they're non-deterministic. So you need to be able to understand how that could affect your entire AI pipeline. Then you have to have visibility into the MCP and agent-to-agent protocols. You need to understand how the tools are being used by these. So observability as well becomes extremely important to be able to get a holistic view of your AI infrastructure in the Segentic architectures, right?

Identity and Data Under Fire

SPEAKER_01 10:18

Just just to add to that, if I may, sometimes the attacks are very subtle. So you may only see the effect downstream. And it's it's not that trivial to to understand what have happened in some cases, and you need to uh to to have the ability to identify those subtle things that happen downstream so you can correlate them back to something else that have happened someplace else. Now, if you have multiple agents, there's always the effect one agent might have over the other. There's of course the identity of who is actually requesting that agent to operate on its behalf, and it's it's more complicated than than a lot of people think.

SPEAKER_02 11:06

Yeah, and and again, Ofer on this this is like you and I have actually had some conversations in a completely different world. Uh, I think you were you're probably at force point when we were chatting about data security. The the thing is the world sort of like finds itself in like a very similar spot over again, right? I think you know the the identity now becomes important again. You know, the the you know the data that these agents are accessing becomes really important again. And and again, when we start to reimagine the security problems of what we used to worry about from a web perspective, all of that stuff is completely turned upside down right now because you now have brand new models of like reimagining how these things are going to get used. So I think I think we're in really interesting times. And I guess, you know, previously the commodity used to be, you know, what what can you, you know, it used to be PII and PCI. But interestingly enough, right now the biggest and most precious commodity are is compute. Is you know, people trying to leverage the GPUs that you have available to be able to do interesting things and attacks and whatnot. And so it's fascinating how when you think about the things that people go after right now, it's it's it they want to be able to use a prompt injection attack to subvert your GPUs to be able to do different things for them. And I and and and we're seeing this. You know, at Cisco, we we have this massive threat research team of like about 500 plus talus researchers that are looking at all the new kinds of attacks that are starting to emerge as well. And uh, and uh and AI is being used to attack AI. So it's starting to get really, really interesting when you start to think about what what is the realm of possibility that we're about to enter into.

Platforms Beat Point Tools

SPEAKER_01 12:39

I would I would go back even to the the more basic problem with which is the classification of data, right? I mean in in an enterprise, a regular enterprise, and I've I've done you know data protection for many years. Um, you know, data goes like this, right? It goes everywhere. It's like water. If people think that they can protect uh data, then you know, think again. It's it's it's really, really hard. And the real problem is the problem of classification. Now, if you haven't done those things yet, you you are in a world of hurt. Because if if you don't have the baseline of who can access what, then how are you expecting your you know, AI infrastructure, your agentic AI infrastructure, your agents to behave where you do not have the right controls of who can access access those that data. Now it's actually much more complicated because there are actually the identity here is composed of two different identities, like the identity of the agent and what the agent can do or allowed to do, and the identity and the identity of who is requesting the agent to operate on on its behalf, which is either a machine or or a user. That is the new identity that that we need to take into account, which is also means that we need to adjust the systems that we have today to that as well. And I would say that the sins of the past are coming to haunt us because the data protection problem is something that is it's it's not been solved. Okay, I've been I've been doing you know cybersecurity or information security for more than 25 years, and it's it's always had been there. So now with with with AI and the fact that we need to access that data in order to you know have those AI agents make conclusions on our behalf, it's it's it's it's much more its problem is much more uh compound by by those issues that we we still have.

SPEAKER_00 14:37

Yeah, and we've seen that, right? Within enterprises, you know, uh when they started with turning on a genetic search agents for enterprise search, right? And being able to optimize information retrieval, there was a lot of those cases where you know individuals could ask questions that was generally prevented due to security by obscurity, right? So nobody knew that file existed somewhere on a finance server or somewhere on the HR server. But now, because we have that broad vertical integration of the data with our agents, that data becomes available and at the touch of buttons. So tying it back to actual categorization and also certain level of guardrails that we can instantiate around those language models is extremely important, right? And it's again, those identities isn't even an either or. In certain cases, you may want to combine the two to be able to say a finance agent with this person requesting shouldn't have access to this. But for instance, a HR agent with this person requesting should have access to it. And that complicates the whole domain, right?

Leaving the Lab for Reality

SPEAKER_02 15:53

See, I think you know one of the core things is when you start to think about security first AI, it isn't about more tools. It's actually all about a platform approach. What you're thinking about when you start to think about this platform approach is that you start to reimagine your application, your agent, your infrastructure into a combined AI system, right? And so when you when you think about this as an AI system, you start to think about this as, hey, what how do I reimagine the lifecycle of this AI system that includes infrastructure, that includes development, that includes validation, production of like all of these tokens that the AI factories are going out and generating? You got to think about this as a complete holistic unit. So with Secure AI factory, security isn't just a feature that you add to your AI application or your agent. It's essentially the fabric that lets you scale the token generation capability about the intelligence that's coming out of your factory, you know, pretty much across all of your layers. And so you also have to make sure that each one of these capabilities sort of reinforce each other, right? So when you when you think about you know, from a model perspective, you know, you have to make sure that your models are appropriately validated. What do I mean by this? Model validation is essentially making sure that these models don't have vulnerabilities inside of them, right? So you're you're you're developing techniques to be able to algorithmically ret team and you know test these models to make sure there are no vulnerabilities. So then once you understand what those gaps are, because no models are gonna be perfect. And it's also really important to know that there are no CVEs established for models. These have billions, if not trillions, of parameters. So it's not easy to be able to say, hey, exactly, this is the vulnerability that you got to go fix, or you've got to patch this thing or remove this in a package. It is now in a very interesting spot where once you understand what gaps exist using algorithmic red teaming or validation, you then build your right types of guardrails that you could go out and enforce to be able to make sure that you know your AI system is protected, you know, end-to-end. And so, so so this is this is a this was a core thesis that we had when we said, listen, everybody's going to need an AI factory. There's no two ways about that. But then the question becomes do they, you know, is is a is is a customer or an enterprise going to think about purchasing an insecure or not so secure AI, secure AI factory, or a secure AI factory? And the answer is obvious, right? They they want a secure AI factory. And that and it becomes really important to provide to them a full capability, a full stack approach where you're thinking about model validation, you're thinking about discovery of those assets, you're thinking about guardrails, you're thinking about understanding the discovery of MCP servers, the tools, the resources that they're accessing, and most importantly, communication between multiple agents, you know, whether it's agents talking to other agents, agents talking to tools, you got to be able to protect the entire lifecycle. And that's really the approach that we've taken at Cisco.

SPEAKER_03 18:52

Yeah, Esteban, build on that. You know, it's it's clearly not a tools issue, as DJ mentions, and it's something where a platform is going to be able to help help organizations scale. You know, we're seeing a lot of this type of work in the AI proving ground, you know, the namesake of this podcast. How are we seeing that actually play out in the real world? What types of benefits is that platform, that secure platform approach, providing?

Securing AI at GPU Scale

SPEAKER_00 19:15

Yeah, I think one of the interesting things is that if we look at how AI's matured over the past five years, right? Initially it was foundation language models out in the internet that people could go and access. Then we started running them locally, and that was in many cases RD teams. It was innovators, it was the financial team that wanted to look for new fraud models, it won, it was the data science team that wanted to figure out how we leverage this on our genomics data or something like that. And it was very easy to protect that because essentially what you did is you segmented it off, you gave the 200 or 500 researchers access to that cluster, and that's where it worked. But over the past year and a half, two years, more and more expectations from the boards, from the executives of organizations is driving that. We need to make AI part of our production. It needs to start realizing ROI, not only in the research, but also in our daily productivity, in our integration and automation strategies, and even in our customer support, right? We see a lot of these general chatbots in support, but making these more and more technically savvy, better engaging with customers, et cetera. So being able to pivot from that RD environment that was very nicely segmented and had a moat around it to production, now suddenly there's huge amounts of integrations. So when we looked at taking some of those RD systems in some of our customers and moving them into production, it was very difficult because they did not think about the security effects of actually implementing these into production application, production workloads, et cetera. And then it came on laying security on top, which A slowed down the implementation and adoption, because now suddenly we can't just interact with the language models. We have to wait for guardrail platforms to be installed. We need gateways to be installed, we need to segment the network properly so that these workloads can actually get access to these models. And it slowed down. And I think if we look at the MIT report that said 95% of AI deployments isn't really getting the ROI, right? I think that has a direct correlation also to that security baseline that didn't exist in a lot of those RD environments where data sciences were doing amazing things, but it was segmented off. So I think taking a security approach from the beginning and at W. We leverage a framework called AI Readiness Model for Operational Resilience, ARMAR. And essentially, you know, taking that holistic approach, everything from not only your infrastructure, it needs to be infrastructure, it needs to be your governance risk compliance, it needs to be how you classify your data, it needs to be how you manage your models, right? And taking a holistic view of that as you start, it's going to accelerate everything, just like DJ said, right? If you start secure, you can start integration day one. But if you start insecure, you've got a long way to walk once you get to that point where you say, I've got something usable. Now let's integrate.

SPEAKER_02 22:45

And then sorry, I was just about to say that, you know, one of the really important things about how WWT has thought about this, which is really unique. And we at Cisco really appreciate this is, you know, the you know, it's a very systematic framework of thinking about how you govern before you build, how do you secure and build a supply chain? How do you protect at runtime and how do you observe and respond? Sort of the broad framework that deeply aligns with how we think about this from an AI security perspective. And I think what really differentiates WWT's view here is always about lifecycle first, right? It's not about point tools. It's about trying to think about, like, hey, security spans, governance, you know, validation, enforcement, and operations. It's a whole thing. It's a life cycle. And it has to be architecture-led. When you think about the emphasis is on validated, repeatable reference designs. When you think about the secure AI factory, it's not about a one-off approach, but it's about having a blueprint of how you go about executing this. And the most important piece here is that you have to be aware of your ecosystem and partner really well, right? It needs to work across different infrastructures. It's got to work across networking security and your SOC tooling and rather than a simple standalone AI control. And I think that to me is really, you know, it resonates deeply with us at Cisco about how we think about the world. And that's why this partnership starts to make a ton of sense.

SPEAKER_00 24:05

Yeah, and I think very important there is also, you know, you could use traditional security controls quite effectively to segment these models and these systems out of your main network. But once you start bringing security controls into your AI factory, into your models, they need to be applied in a far more robust way so that they can scale. Because you can secure a single machine with eight GPUs, that's no problem. But we're talking about Nvidia and Cisco stacks that are, you know, data center size, right? So you need to understand you're not going to have a you know, per stack 70 terabits per second firewall. You need to think about that differently and you know, drive down that risk, leveraging the capabilities of things like DPUs and you know, a software stack that's distributed and has that ability to scale.

SPEAKER_03 25:11

Of course, security needs to be baked in at every level here. I I want to get uh your perspective just on Bluefield 4. Is the idea there that you know that security is is also working and being implemented even like below the AI itself?

SPEAKER_01 25:27

So so even with Bluefield 3 with our current AI nodes and the current AI factories, we basically bake security in the infrastructure itself, meaning that you can take your next gen firewall uh and implement it in on the Bluefield card, which uh runs isolated from the host in its own thoughts domain, and basically apply the rules there and not allowing whoever is in control of the host to temper with the way that those policies are being applied. So here you go, you have you know micro-segmentation into the into the AI cluster uh done. You can also, of course, integrate some guardrails into that, you can uh into that firewall, you can uh when looking into inferencing, you can send some of that uh decision making through the auto-band port of that uh DPU to GPU-powered servers that will run other models that will let you know if there is an attack happening. And of course, you can integrate with uh Docker Argus. Docker is our software development framework for the Bluefield cards. Docker Argus is our real-time workload threat detection where we provide real-time cost introspection, hardware to hardware, memory to memory, without having any performance impacts over CPU or GPU, with zero integration in software to the host other than our driver. And we provide the you know situational awareness about what's going on with those AI workloads. So this is, you know, these are like three different examples of what you can do today with Bluefield 3s. Bluefield 4s, for those who are unaware, are six times the the uh CPU power of the Bluefield trees, so you can run even more things. We have some uh wild ideas that we are already starting to test. And these allow you to still uh have the best performance possible because you are now running some security capabilities inside a complete different compute that happens to be on the same host, happens to run on a separate uh compute, uh, but eventually it frees up the CPU and the GPU to do what they need to do, is and it is to generate as much as tokens as possible. There is there is uh real value here, and we and that had been demonstrated by some of our partners. And it's not just it's not just a security effect, it's also the fact that you also control networking at the same time, right? So you control networking and security, and those are usually the most demanding applications that you have when they are running on a regular host. So running those features and capabilities on the DPUs allow you to basically generate as much as tokens as you can, fully utilizing the hardware.

Resilience in Real Operations

SPEAKER_03 28:17

No, absolutely. Esteban, I think, but correct me if I'm wrong here, we try to operationalize this and kind of everything we've talked about through our AI readiness model for operational resilience, you know, which because everything deserves a great acronym, we call ARMO. Walk me through ARMOR and how that's weaving into the conversation that we've had thus far.

SPEAKER_00 28:36

Yeah, as I entered a little earlier, right? When we were in the original conversations with a number of our partners, Nvidia, Cisco, a number of others as well, you know, similar to what Cisco did with their framework, right? We saw the opportunity of saying, you know, let's have a look at where we need to define the known unknowns and the unknown unknowns. Because a lot of our customers were coming to us and saying, we're building out these large-scale data centers with AI compute, but we don't know how we're going to actually implement the security controls. Do we just leverage traditional firewalls? Do we still load agents onto the endpoints? So being able to break it out into six domains with the overarching domain of cyber resilience, right? We really look at each stage of that and facet of that AI workload that you're going to be building, whether it's hosted on-premise, and we're seeing many customers also taking a hybrid approach, right? Because you want, as we're moving workloads onto these AI compute platforms, there's going to be a lot of critical workflows that's going to be running on that. So you need a resiliency in your architecture. Because the kind of thing I always say is there's very few times that a thousand people in your company might become sick on the same day, right? But if your AI supercomputer cluster becomes unavailable, it's going to be equivalent to that, right? So making sure that you've got proper hybrid design with the resiliency minded, right? And taking that from the governance risk compliance so that you know where you can run your models, especially in sovereign clouds, we see that a lot of times because at the moment there's over 70 different countries busy building AI governance and the regulations. And depending on where you want to run those workloads, you need to be able to be aligned with that sovereign AI, you know, build out. And then in addition to that, you know, as I was saying, it's governance-risk compliance, it's the model protection, all the way from the training to the inference. It is the data, it's your infrastructure, everything from your identity to your, you know, controls layer, like Ophel was saying about the DPUs, right? We see that even expanding beyond AI by itself. And then your security operations, right? These AI compute data centers are going to be generating a lot more telemetry than your normal data centers because you're not just looking at events. You have to look at the performance, the operational capabilities of those hosts, making sure that they're actually running at that 99% that you want them running, right? So when performance drops, that could be security incidents affecting it, denial of service attacks, it could be poisoning attacks, all those kinds of things that you need to be able to inference your operation, secure operations teams need to actually understand. And then last but definitely not least is we see a lot of interaction at the moment when compute clusters like these are being built. There's generally software development associated with it, right? People are building out new applications that integrating. So making sure that you're taking a responsible approach to that secure development as you're actually integrating into your AI data centers. And we see, you know, with the kind of code generation explosion with the AI assisted code generation, the similar kind of code defense needs to also become extremely important to apply as you move forward at scale.

What Secure AI Really Looks Like

SPEAKER_01 32:25

Yeah, one thing that I want to add is that we didn't mention that, you know, this is something I've heard from Ishtagman as well and from others. Customers do not want to install anything on the AI node other than the AI software. And they want to make sure that they utilize the utilize their investment as much as they can and to the maximum. And that is definitely something we are hearing. Now, the problem that that we are seeing is that if you use the traditional security tools and you just use them on an AI node, that's that's like oil and and water. Because of the side effects those may have, and because of the CPU demands, and because of other things that simply are not are not a fit, where you need to do throughput and efficiency as your operational guidelines. And that is also important to say, because there is a reason why we chose to say, oh, we have this special smart uh card or smart network adapter that had that is actually a computer, that we rather have some of those functions lines there. It's not just you know uh not looking into what we are also being asked to to do, right? So definitely it's part of our vision, but we are definitely getting feedback from the market itself, from the customer, from the end customers, that this is something that they definitely would like to do because they still need security, but they still would like to maintain the best possible performance they can, because that's how eventually what we term as token economy works.

SPEAKER_00 33:59

In the AI proving ground, we actually tested some of those things. And, you know, if the exactly like you said, Dafir, the challenge is that it can take anything from five to 15 to 20 percent of that performance out of the loop. And when you're looking at customers that are buying, not based on their budgets, based on the power that they have available, performance is extremely important. You know, it's not always the case that they can just spend more money to get more compute. It's also about where they can host that, where they can actually even get power to drive those.

SPEAKER_03 34:41

Yeah, DJ, I want to get um your perspective on this because we covered a lot just in the last couple um minutes. We talked about Bluefield 3 and 4. We talked about WWT's armor framework, we're talking about market realities, we're talking about performance. Tether us a little bit to kind of the root of the conversation that we're having today around Cisco's secure AI factory. What does all this mean in comparison to what Cisco is doing with that solution?

Old Attacks. New AI Targets.

SPEAKER_02 35:04

Right. So here's the thing, right? I think you know, to Ofer's point about optimizing, you know, the the security solutions to the hardware that you have available, I think it's really important to understand that you have to be aware. Like there, there are, you know, there are STKs and frameworks that that NVD is releasing that that makes it easy to be able to run some of these things more efficiently and effectively on top of the compute that they provide, on top of these blue field chips. And uh in you know, at the Paris GTC, you probably heard Jensen talk about how AI defense, Cisco AI defense is now NIMS ready. It's available as part of the NVDS microservices, you know, spectrum of you know capabilities where we're basically saying, hey, listen, we're gonna implement these things seamlessly, where you can bring you know Cisco AI defense up as the NIMS microservices. You can have the GodRail's service run here, and and we're constantly optimizing these, you know, based off of the SDK that, you know, whether it's you know Morpheus or the whatever the latest name that that NVIDIA has for it, you know, on the SDK side, because it's constantly evolving, right? They're also moving at the pace with which the agents are moving, making sure that we're using those SDKs that leverage the libraries, you know, that that these devices export, it becomes really important. And I think that's where you get that efficiency. That's where you start to have those meaningful conversations with the customers where you say, hey, you're gonna get the bang for the buck in terms of what you bought from your AI needs. But at the same time, you gotta be able to make sure that security is running at the edge where it's supposed to run, right? Now, the the the key thing from a sovereign cloud perspective, you know, on the topic of how sovereign clouds are being set up and how lots and lots of projects are being spun up inside. And Cisco has a very clear point of view. We we announced our our ability to sort of support these sovereign build-outs in a meaningful way. And more specifically with respect to AI defense, we built this architecture in such a way that you have the ability to deploy this in a hybrid fashion where you can keep that data path close to where you need it to be, whether it is inside of your, you know, your AI factory, inside of, you know, specifically across your own colos, your local data center, or even for certain projects, you might need to have some of these deployed inside of a VPC because nobody has like a pure single AI factory. These AI factories are being split across, you know, hyperscalers, neo clouds, you know, sovereign environments in your own enterprise. So the ability to sort of have a hyper deployment with a single control plane that can sit inside of your sovereign point of presence. It can be inside of your sovereign cloud and pick your geo. It's within that geo boundary. But being able to then have a data plane cut across the board makes it extremely simple for a security operations person to be able to say, okay, I now have like a full view of all of my entire security deployments for AI security, and I can look at it from a single place. And uh, and I'm and and all of these things are optimized for the GPUs that you have deployed. I think it makes a huge difference.

SPEAKER_00 38:09

I think the optimization is huge, right? I I think one of the important things, similar to the kind of DGX platforms that Nvidia builds, you know, what's great about the Cisco, you know, secured AI factory is that you optimize across all the layers. So you're not just it's secure by default, but you're optimizing the network, the integration points, your OS hardware delivery, your even your power management across those things. So I think that's also a key space, right? You know what to expect when your secure AI factory actually is delivered at your doorstep.

SPEAKER_02 38:48

That's right, it's not you're smart on. I couldn't agree more.

Watching AI Make Decisions

SPEAKER_01 38:51

Yeah. I would like to go back a second and um just talk about the attack vectors. I mean, we jumped into a lot of of things, and the attack vectors you know stem from first the the regular cyber attacks or the regular stuff that we carry over. So if it's an infrastructure-related attack, right, that might be attacking your your containers or that runs AI workloads, okay? AI workloads operate in containers, or attacking your OS, or attacking your APIs, or attacking your MCPs, or attacking your A2As. These are all things that we are familiar with on regular data centers, and we pretty much understand how they operate and what to do, and that's also why we've decided to bake in real-time workload threat detection inside the DPUs because of the fact that it it takes much of the CPU utilization to do so. There is, of course, the how you verify that the agents are doing what they are supposed to do, and you have the right observability, as Ishtavan said rightfully so in the beginning of this discussion. And that is a very complicated thing when you look at uh agenda ki. You need to combine that different telemetry from different agents into a complete picture and understand whether something is wrong downstream. And of course, there is the protect the agents themselves from attacks that are, I call them application level. Uh, they you can call them through the prompt. You see that I'm not saying prompt injection, I'm saying through the prompt because there's a lot of different things that can happen there. So there's a multi-layered type of risk, right, that you need to approach in multiple different layers as well. So, as DJ said rightfully so as well, we at NVIDIA also provide and bake the ability to hook onto and to defend through our different SDKs, like the Nemo SDK, like using the NemoTrone SDK to generate uh data and uh static information to build on and to uh basically train those those modules. So there's a lot of moving parts here that eventually come together, and that's only on the uh securing AI, where we are also have the using AI for security type of stuff that is is is crucial at the same time, right? You're you're eventually the inferencing models that tell you if you have uh some sort of uh prompt injection attack or a prompt-related attack, runs on top of the GPUs, being trained on top of the GPUs. And that's that's uh basically what uh gives us that unique approach to look at it from an end-to-end uh perspective as well. So all of this needs to come together, and all of this needs to be part of an end-to-end solution. You don't always we are focusing on the entire picture because those data poisoning attacks are usually related to someone getting access to the data. How are they gonna get access to the data? Well, they go through the infrastructure on a regular attack that we all know how to do. How are they gonna try and attack MCPs? Well, through the same things that we have been doing for ages. So some of it is is old and a lot of it is new, and the new is is more complicated than the old.

The Ethics Line AI Can’t Cross

SPEAKER_02 42:09

And I know for the one of the most interesting things is that these attacks are now being crafted using just English language, right? It's not uh, you know, you don't even need to know uh complex code like you know to be able to figure these things out. And and then the interesting other part is that these, you know, you you really have to like observe these agents' behaviors very closely, right? Observability is such a crucial part of this because you know, because these agents don't fail loudly. You know, they don't show up with like a giant editor that just says, hey, here's what's happening. They start to drift quietly. And you know, you you have to sort of almost reconstruct the behavior from scattered signals and not just watch for like those metric spikes, because you know, these agents are distributed and they're stateful, which means you know a single outcome is often the result of many agents interacting across services, tools, environments, making it really difficult to trace causality from end to end. And then the next part of it is the telemetry is fragmented because each one of these agents are coming at different places. These tools are coming from different places. And then, you know, when you start to think about the intent, was it's the execution gap. Like here's the absolute intent that you have asked the agent to go out and do, but the execution is drifting so far away. You know, it might the agent might still be working as designed locally, you know, but it's still producing unsafe or unintended outcomes downstream, like you mentioned, right? And so this is where I think what becomes really important is that you know, you have to understand that these are fundamentally new types of attacks, like you said. And and then observability becomes incredibly important. And as part of this, you know, as we communicate our secure AI factory story, there's this really important ingredient inside of that with Splunk. You have the ability to sort of start doing these distributed ways of being able to track all of these things. We've been doing this for many, many years across lots of. Surface areas, but but you now have like the the the right types of tools to like track the observability of all of these agents and start to piece together a more holistic story and then help you put you know a very strong defensive position. So I just wanted to add that you know I'm 100% aligned you know with how you're thinking about this.

SPEAKER_00 44:18

Yeah, and the interesting thing is the vulnerability space and the data teaming space of these language models are more akin to social engineering than it was to anything like exploit development and stuff in the box. So you know, we're really trying to approach this from an English language perspective to suggest or nudge the agent to do things that it wasn't expected to do. And then, you know, we've also had the strange change that the trusted and ethical side of security of delivering outcomes with these language models have inadvertently dropped onto security's lap as well. Because there wasn't a lot of tools out there that was potentially bringing out an unethical outcomes, right? So now it's not only securing it, it's also making sure that these things are aligned with your organization, your regulatory ethics, making sure that it's not going off path, even from that perspective, right?

The One Thing That Matters

SPEAKER_03 45:18

Absolutely. I mean, DJ, I love the way you put that, you know, drift quietly. This is, you know, as we move into 2026 and even when we get into 2027 and beyond, these things aren't going to just it's going to drastically change how security teams um approach securing the organization. We are running um up on time here, so I am gonna end there. This is a fantastic conversation to the three of you. Thank you so much for for taking the time out of your busy schedules to join us here and talking about um a very important topic. As I mentioned, things are going to change. Cisco's Secure AI Factory with NVIDIA can certainly play a very important part in um keeping up to date and keeping up with those types of trends. So uh thank you to the again to the three of you. Uh, hope you all have a good rest of your day. Okay, what this conversation makes clear is that AI security is less about preventing failure and more about managing that drift. These systems don't break all at once, they change slowly, and by the time the outcome looks wrong, the cause is already buried. So the key lesson is simple security needs to be designed into the AI factory from the beginning. This episode of the AI Proven Ground Podcast was co produced by Nas Baker, Kerr Kuhn, Diane Devry, and Addison Ingler. Our audio and video engineer is John Knoblock. My name is Brian Phelps. We'll see you next time.

Podcasts we love

Check out these other fine podcasts recommended by us, not an algorithm.

AI Proving Ground Podcast: Exploring Artificial Intelligence & Enterprise AI with World Wide Technology