AI Proving Ground Podcast: Exploring Artificial Intelligence & Enterprise AI with World Wide Technology
AI deployment and adoption is complex — this podcast makes it actionable. Join top experts, IT leaders and innovators as we explore AI’s toughest challenges, uncover real-world case studies, and reveal practical insights that drive AI ROI. From strategy to execution, we break down what works (and what doesn’t) in enterprise AI. New episodes every week.
The Shift from AI Pilots to AI Infrastructure
As organizations move beyond experimentation, compute has become a deciding factor in whether AI delivers real value or stalls before production. In this episode of the AI Proving Ground Podcast, WWT VP Neil Anderson, NVIDIA VP Chris Marriott and Cisco VP Daniel McGinniss talk about how organizations are rethinking where AI runs, how it's secured and how value is measured.
More about this week's guests:
Neil Anderson has over 30 years of experience in AI, software development, wireless, cyber and networking technologies. At WWT, Neil is VP and CTO in our Global Solutions and Architectures team, with responsibility for over $16B in WWT's solutions portfolio. Neil advises hundreds of Fortune 1000 companies on their global architecture and technology strategy.
Daniel McGinniss is Vice President of Product Management for Cisco Compute, responsible for developing innovative products and establishing new routes to market for Cisco's multi-billion-dollar compute and SaaS infrastructure management portfolio, while driving new solutions and as-a-service offers with ecosystem partners in alignment with customers' most critical business needs.
Chris Marriott is vice president of enterprise platforms at NVIDIA, where he has spent the last 14 years advancing enterprise solutions. With a background in engineering, including 10 years in ASIC development, Marriott combines technical expertise with strategic insight to address the evolving technology landscape. Outside of work, he enjoys playing ice hockey and exploring the outdoors with his family.
The AI Proving Ground Podcast leverages the deep AI technical and business expertise from within World Wide Technology's one-of-a-kind AI Proving Ground, which provides unrivaled access to the world's leading AI technologies. This unique lab environment accelerates your ability to learn about, test, train and implement AI solutions.
Learn more about WWT's AI Proving Ground.
The AI Proving Ground is a composable lab environment that features the latest high-performance infrastructure and reference architectures from the world's leading AI companies, such as NVIDIA, Cisco, Dell, F5, AMD, Intel and others.
Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments.
From World Wide Technology, this is the AI Proving Ground Podcast. Right now, AI progress is usually told as a model story. What's new, what's faster, what's the next release. But underneath that momentum, at least for enterprise organizations, is a very real constraint that is more basic: compute you can actually deploy, power, secure, and operate without it falling apart after the pilot. So in this episode, we talk about what scaling AI really means when the constraints are chips, infrastructure, data gravity, and the hard handoff from proof of concept to production. We get into why security can't be the last meeting on the calendar, why GPU utilization is quickly becoming the new board-level KPI, and why the next phase of AI may push more compute out to where the data is being created: the core, the cloud, and increasingly the edge. To unpack it, we're joined by three experts who've been in the trenches for decades. Neil Anderson leads cloud infrastructure and AI solutions here at World Wide Technology, where his team helps organizations move from experimentation to real operating systems. He sees where things stall once the slide decks are over. Daniel McGinniss runs Cisco's compute business, sitting right at the intersection of servers, networking, and security, where AI ambitions either become production ready or stay stuck in pilot mode. And Chris Marriott leads NVIDIA's Enterprise Platforms business, working with customers who are trying to translate unprecedented compute capability into something usable, efficient, and sustainable inside the enterprise. What Neil, Daniel, and Chris cover will change how you think about the infrastructure decisions being made today that quietly shape what your AI strategy can become tomorrow. So let's jump in. Okay, Chris, I want to start with you.
I just read an article, or a blog post, from OpenAI's CFO that talked about how compute, among other things, is among the scarcest of resources when you're talking about AI. That probably didn't surprise anybody, but as it relates to compute, what are the challenges you're seeing as organizations try to scale AI adoption within their enterprise?
SPEAKER_02:Yeah, it's a great question. And the challenges range all the way from what you're referring to, with our cloud partners at massive scale getting this infrastructure in. The way to think about it is what Jensen talks about: our five-layer stack is now a challenge to this entire ecosystem. It starts first with power, since AI is a real-time application that has to process and respond, and it needs access to a ton of power. And in 2026, I'll just take a guess, most of those data centers are already filled up with infrastructure, or plans are already being built, and all of those kinds of things. So first and foremost, there's the challenge of just getting access to power and liquid cooling and the foundation. The second part of it is obviously chips and infrastructure, where NVIDIA sells that technology part. And so it's no surprise, we also have a ton of customers all wanting infrastructure at the same time. Again, the challenges scale all the way from our cloud partners and large-scale hyperscalers down into the enterprise as well. So getting access to those chips and systems and infrastructure, and building all of that out, is one big challenge that we're working through with the entire ecosystem: the supply chain, the crunch on CPUs, the crunch on memory. It's a huge problem in enterprise and cloud and everything else. And then you keep walking up the stack: in those data centers, you have all of this infrastructure that you have to tie together, orchestrate, and deploy all the software on to run all of these things. So all of that data center infrastructure is another challenge.
And then the next two, really, I think, are the ones that enterprises are working through right now. There's the model layer: which models to use, a hybrid approach of frontier model versus open model, and how you tie all of those things together to build agents that can live within your enterprise, speak to your data, and pull intelligence out. That ecosystem can be challenging. And so we're obviously here to talk about that as well today, but we work across the spectrum to help partners tie together solutions from every provider to build those agents and those solutions so that the last layer, all of the applications that will live on top of the first four layers, can actually make business-critical impacts. So it's definitely all five of those layers that enterprises are running into. And then there are all the challenges of data governance, data sovereignty, how they manage that data moving in and out of the AI and the infrastructure, and all those kinds of things. It is not for the faint of heart, and that's why we love our two partners here today that speak every day to those enterprise customers. So I think that's what we're seeing from customers and partners.
SPEAKER_03:Yeah, Danny, I want to get to you here in a second, but Neil, Chris unpacked a lot there. He talked about that five-layer stack. Are we seeing the same thing as we talk to organizations? And are there any specific areas of that stack, one through five, that are giving more fits than others?
SPEAKER_00:Yeah, one of the things we do every day with our AI Proving Ground lab is help answer some of those questions Chris talked about. Should I use this model or that model? How many GPUs am I going to need to run a given workload? What is the thermal profile going to look like for this? We answer a lot of those questions, including: what is my storage and data layer going to look like? Can I run my preferred storage vendor with the AI cluster that I want to run? Will that work? We answer those questions every day. That's one of the things we're here for. We have such a great partnership with both Cisco and NVIDIA around being able to answer those questions, move the process along, and get people to the outcome they're actually looking for. There are a lot of decisions that have to be made on the technology side, and if you don't have a capability like our Proving Ground, it can take a long time to get all the answers that you need. What we're trying to do is accelerate that and get people to the AI workload outcomes they're really looking for. So that's a big part of it. As far as some of the other things, Chris touched on it: power and cooling is a definite challenge for sure, and that can constrain things as well. But once again, we have partners across the ecosystem of neo cloud providers, the big cloud hyperscale providers, and colo providers. We help customers figure out where they're going to land it, as I call it. Where are you going to land your workload? If you can't put it where you want it in your own private data center, we have options. So we try very hard not to let that constrain making progress for the customer.
SPEAKER_03:Yeah. And Danny, let's bring in Cisco's Secure AI Factory here. This is certainly part of a series where we're covering a lot of different areas of the solution. As it relates to compute, how does the Secure AI Factory change any of this equation or conversation, if it does at all?
SPEAKER_01:No, it does a lot. I represent the server business at Cisco, so certainly I'm a little biased, and I'm at the nucleus. And I think right now, GPUs, because they're such a big part of the spend and driving a lot of the decision making, are where a lot of this conversation starts. But what customers quickly start to realize is, yeah, getting the GPU stood up is hard because of the things that Chris and Neil were just referring to, but it's even harder to take a proof of concept and get it ready for production, because the networking has to get figured out. Power and cooling have to be figured out at scale. It's not just what you need to get a proof of concept up, it's how this is going to look as we start to grow it. It's no different than what data center teams have been doing for many, many years, thinking about how things scale. It's just exponential now. Then quickly you get into: what do your data pipelines look like, and where does your data sit? Are we training? Maybe that works in the cloud. But if I'm doing RAG, well, I'd probably have to be closer to where my data actually sits. Maybe that's even the edge. And so as you're planning for all of this, you get to that point, and then you start to think, well, crap, this is production. How do I make this a secure environment? Because I'm never going to get this past my security team if it doesn't have the right posture. And then, how am I going to operate it? Because if my operations team says I can't put an SLA on this, are we really ready to make that a production application? So that's the concept behind the Secure AI Factory. NVIDIA really started this with the concept of an AI factory; we built on that at Cisco and turned it into a secure AI factory. How can we get this stack as close to ready as possible?
So then WWT can take that and do the last-mile customization, because they're really the ones working with the customer, tweaking and tuning it. If you look at this evolution, it starts with NVIDIA, then Cisco takes it and builds out the entire stack of everything I just talked about, security, networking, compute, storage, and gets that ready, and then works hand in hand with WWT to get it personalized for the customer. That's, I think, where the value of that factory plays in.
SPEAKER_02:Yeah. And I'll just jump in and add: I think the way Cisco brought the secure part of the Secure AI Factory into this is about where you start. Security is not something you just bolt on after the fact. You really have to treat it as a day-zero operation, from the ground up. We've done the same thing at the infrastructure layer, where we're now building confidential computing into the hardware platforms as a base security layer. But then Cisco takes that further with AI Defense as part of the Secure AI Factory. That security part is critical, and it helps customers make those decisions faster and get to deployed workloads faster. Absolutely.
SPEAKER_01:Yeah, and just to follow up on that, I really think the important piece is that there's a lot of customization. Each one of these layers brings versions of firmware, versions of OS, versions of applications. And getting all of that ready to even start to install your production workloads is not an easy feat right now. I think that's where, Neil, I don't want to speak for you, but that's where you really depend on us. We have teams of people doing nothing but sitting in a lab making sure that the infrastructure, the OS, and the security layer are ready and stable, so that you guys can take it and get it ready for a production workload.
SPEAKER_00:Yeah, absolutely. And the simpler you can make that for us and our clients, the better, right? We can focus on what the AI workload is, what model you want to use, the data pipelines, and actually turning it up. The other thing we spend quite a bit of time on, both Danny and Chris, is that our customers want to see return on their investment. So how do we get to the utilization of those GPUs that they want to see? And how can I schedule as many workloads on that cluster as I can, in an organized fashion? It almost reminds me of a secure multi-tenant environment, and in a lot of cases it is: hey, I've got different lines of business I'm going to serve with this large AI cluster. I need to make sure they've got secure multi-tenancy, but at the same time that I'm driving those GPUs to maximum utilization. That's another thing. We would rather spend our time on those problems, Danny, to your point, than get bogged down in the validation of this storage firmware with that compute firmware with that GPU firmware. We don't want to be in that business. We want to be in the business of taking those recipe ingredients that are already pre-validated and then focusing on actually getting you the outcomes and the workloads, and getting those going.
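The tension Neil describes, keeping tenants isolated while driving GPU utilization up, can be illustrated with a toy admission-control sketch. This is not any real scheduler's API; the class and method names are illustrative, and it uses a deliberately crude isolation rule (GPUs are never shared across tenants) as a stand-in for secure multi-tenancy.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    tenant: str       # line of business requesting the work
    gpus_needed: int  # whole GPUs this job requires

@dataclass
class GpuPool:
    total_gpus: int
    # tenant name -> GPUs currently allocated to that tenant
    allocations: dict = field(default_factory=dict)

    def free_gpus(self) -> int:
        return self.total_gpus - sum(self.allocations.values())

    def schedule(self, job: Job) -> bool:
        """Admit the job if capacity allows; otherwise defer it.

        Allocations are tracked per tenant, so a utilization report can be
        broken down by line of business -- the board-level KPI angle."""
        if job.gpus_needed <= self.free_gpus():
            self.allocations[job.tenant] = (
                self.allocations.get(job.tenant, 0) + job.gpus_needed
            )
            return True
        return False

    def utilization(self) -> float:
        return sum(self.allocations.values()) / self.total_gpus

pool = GpuPool(total_gpus=8)
jobs = [Job("finance", 4), Job("retail", 2), Job("hr", 4)]
admitted = [pool.schedule(j) for j in jobs]
# finance and retail fit (6 of 8 GPUs busy); the 4-GPU hr job is deferred
```

A real cluster scheduler (Kubernetes with GPU operators, Slurm, Run:ai and the like) adds queues, preemption, and fractional GPU sharing, but the trade-off is the same: every deferred job is idle capital, and every co-scheduled job needs an isolation story.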
SPEAKER_01:That's right. I think it's very much about economies of scale, because we're not talking about inexpensive platforms. So to your point, a GPU that is being underutilized is a tremendous waste of capital. And if you think about the resources it takes, each one of these organizations has a role to play in the value chain in order to keep this from being an unaffordable or unattainable project. So we're all trying to create as much value as we can and let customers leverage the economies of scale we each have and the role we each play in that value chain.
SPEAKER_03:Yeah, Neil, along those lines, what other types of key decisions have to go into this when choosing that AI compute platform? What goes through organizations' minds, what goes through our customers' minds, and how are we advising them to work through those?
SPEAKER_00:Well, it's certainly a little bit daunting if they're not familiar with the technologies, right? And what I really like about the approach both NVIDIA and Cisco are taking with the Cisco Secure AI Factory with NVIDIA is that they're essentially approaching our clients with familiar technologies. They know Cisco compute, they know UCS, they know how to orchestrate that. They know Cisco networking, they know how to orchestrate that, and they have expert teams in that. So just that alone, I think, removes a little bit of the daunting factor: hey, I can get spun up on this pretty quickly and not get bogged down in everything that's going on. But there is a lot going on. It is daunting, it's complex, and it changes every day. There are new models being published every day, right? There are new ways of orchestrating this, new ways of virtualizing it. It's still daunting, but we help customers with that every day so they're not there alone. We can step in and really help them get up to speed, help them virtualize, help them secure it, et cetera. And by the way, I think the Cisco Secure AI Factory is very unique because of that security layer. That just removes an entire set of objections from our clients. Where, to the point you made earlier, Chris, you get well down the path, and all of a sudden somebody raises their hand and says, wait a second, what does the security look like? What's the policy for this data you're shoving into the GPUs? Planning that from the start, with things like AI Defense and the mesh firewall and Hypershield baked in from the beginning, just makes it that much easier for clients to get to adoption and get to scale.
SPEAKER_03:Yeah, absolutely. Several times you've mentioned data. Chris, what does gravity of data, or placement of data, whether it's edge, core, private cloud and so forth, do to where we run our AI workloads? How do we think about that?
SPEAKER_02:Yeah, it's a super important question when it comes to enterprises, because enterprises range across several verticals, including regulated verticals like finance and healthcare and several other critical ones, even across government. And so data, and data gravity, as we all talk about, really defines where a lot of that infrastructure is going to be. Now, I honestly think every enterprise is going to have a hybrid approach, where there are parts where they might have gotten started in the cloud, and part of those workflows need to reside there. But every one of the enterprise customers we talk to wants to bring those workloads back on-prem, where they can manage the data and where the data resides. That's number one. Number two, there's this hybrid approach to the models, as we talked about as well. Now you have model routers and intelligent agents that can choose the appropriate size model for, let's call it the token cost, which becomes really critical for all these agentic workflows. The model routers can choose: do I use an expensive frontier model from a cloud, or can I bring that on-prem and make a better cost decision, in tokens, with my data on-prem? And then, with a lot of that on-prem data, that business-centric and private data they have, they want to take open models and train them with that business-critical data, because that's really what sets one business apart from the next: the proprietary data they want to use.
And not to mention, when you look across enterprises in different countries or continents, you have the concept of sovereign AI, where you even have neo clouds and NCP partners building data centers in region, because they want to train or orchestrate that sovereign data in the region where that intelligence is created. So I think where the data resides is foundational to an enterprise's decision: can I keep this part of the workflow in the cloud, or bring it back on-prem where the rest of my workflow lives? And potentially that has benefits from a latency perspective as well.
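The model-router idea Chris describes, choosing the cheapest model that can still handle a request, can be sketched in a few lines. Everything here is hypothetical: the model names, the per-token prices, and the single "complexity" score standing in for a real request classifier.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # illustrative prices, not real vendor rates
    max_complexity: int        # highest task complexity this model handles well

# A tiny catalog: a small on-prem open model, a mid-size one, and a frontier API.
CATALOG = [
    Model("onprem-small", 0.0002, max_complexity=3),
    Model("onprem-medium", 0.002, max_complexity=6),
    Model("frontier-cloud", 0.03, max_complexity=10),
]

def route(complexity: int) -> Model:
    """Pick the cheapest model whose quality ceiling covers the task."""
    candidates = [m for m in CATALOG if m.max_complexity >= complexity]
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)

# A simple lookup stays on the small on-prem model; a hard multi-step
# reasoning task is the only thing that pays frontier-model token prices.
```

In practice the router's input is produced by a classifier (often a small model itself), and the decision also weighs data residency: a request touching regulated on-prem data may be pinned to local models regardless of cost.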
SPEAKER_01:Yeah, actually, if I could build on that one step further, we recently introduced what we call our new unified edge platform into the market. And I'd say the edge is becoming equally, if not even more, important in certain instances, as we look at pretty much every major industry, retail, manufacturing, healthcare, life sciences, government, all starting to reinvent what the customer experience or the worker experience looks like at the edge. The way I think about it, having talked to customers so much about this over the last six to nine months, is that we've been collecting so much data at these edge locations, through cameras, through sensors, through devices being installed. But until recently we hadn't started figuring out the more interesting use cases and things to do with that data. Now you've got so many different software companies and ISVs coming in with solutions. I get blown away by the solutions I see on a daily basis. Somebody has a new idea: hey, we can unlock that data and do something really cool with it. Holograms are going into retail stores to have real agentic conversations with customers. So as you see that happening, data gravity is even more important, because in order to deliver that realistic, real-time conversation, it doesn't have time to traverse even a short WAN circuit, let alone the bandwidth that would congest that WAN circuit so quickly. And Neil, I know you guys are doing a ton of this.
SPEAKER_00:Yeah. And this is the way I talk about it with clients, Danny and Chris. I think up until now we've sort of been in this AI world that I call client-server, right? You fire up a chatbot, you talk to the AI cluster, you get responses back, and it's all nice and tidy. And chances are that RAG model was built by bringing a bunch of data to the AI cluster. So you're bringing data to AI. I think this is going to really start flipping on its head, and I call it bringing AI to the data. When you look at agentic AI and you look at physical AI, computer vision cameras, digital humans, robotics, and even just agentic AI, I think you mentioned it, Chris: until yesterday, your RAG model was hitting one model, right? One big foundational model. In the future, very quickly, these agentic applications are going to be hitting hundreds of models. They're going to go to the right source for the right information, with intelligent routing of pieces of the prompt and things like that. So I think this is really about to flip on its head. And that's where, Danny, the edge platform that Cisco released is going to be critical. The future architecture for AI, to me, is: yes, you're going to have a big central cluster, but you're also going to have these distributed environments all over the place that bring the compute to where the data is being generated.
SPEAKER_02:Yeah. And to build on top of that a little more: with the move over, let's call it the last year, to these new agentic workflows, it's no longer just the prompt and response where we started, exactly as you were saying. Now these agents, and I think this will continue to grow, and it's why I think Cisco's unified edge platform becomes really critical, are not just processing and generating intelligence and insight from data. We're asking them, and we're going to keep asking them, to take action and to make decisions. And when you bring that to the physical world, in terms of edge retail, healthcare, customer experiences, robotics, all these areas, to be able to take action, you just won't have time to go back to a mainframe or a central location. These models are going to be intelligent enough to reason about the decisions they're making and the actions they're going to take, and they're going to be doing that at the edge as well. So I think that's definitely why that platform is critical.
SPEAKER_01:Yeah. So Chris, I think that's a good point, and I'll build on it even further. Whether it's moving out of the cloud into multiple data centers or not, the problem gets even more exacerbated when you move to hundreds, if not thousands, of edge locations, when you think about how you manage the infrastructure at scale. There's a reason in our industry we see the pendulum swing back and forth every decade or so between centralized and decentralized. We're moving back to a decentralized model. Our belief is that across all of AI, the operating model is very, very important in order to make this attainable and real. When you start thinking about operating infrastructure in thousands of remote locations, how to avoid expensive truck rolls, and how to make sure you have configurations that are secure and standardized, blueprinted so they're all identical to one another, that's really, really critical. And so we at Cisco really believe in building the right tooling and the right automation and the right orchestration to help deploy this at scale. I think that's what's actually making the edge real.
SPEAKER_03:Yeah, I love that we're bringing up agentic and physical AI. Neil, we've touched on it a little bit, but what decisions are organizations making today that determine whether they'll even be able to participate in or engage with those workloads in the future, when things are primed and ready to go? What decisions are we making today that might be quietly determining what the future holds for an organization?
SPEAKER_00:Yeah, and I think it really depends on the organization, its maturity, and how far along they are on their use cases. Some people are still figuring out the basics of RAG models and enterprise search and productivity tools, right? Just getting started with that. We have other clients that are much farther along. Robotics seems a little sci-fi, but I'm telling you, we have real clients trying out robots in their environments and using AI for those. We have real clients doing computer vision at scale and using that information in different ways: safety, productivity, a retail customer trying to understand customer behaviors. There's just a ton of use cases there. So this stuff is real. We talk about physical AI as a little bit futuristic, but I'm telling you, it's there. And by the way, agentic is real; it's not something in the future. We are building agentic applications every day, for our clients and for ourselves. One of the most popular use cases we have is what we call agentic ops, which is transforming someone's network operations center with agentic AI to help them get a handle on the thousands of alarms they get, and take action on the ones they've already resolved a hundred times. Let's take the human out of the loop on some of those and just go ahead and resolve them, to your point, Chris, about being able to take the action, not just reason. We're building it every day; it's real. And so some of the decisions customers make, back to your question, come from thinking about that future. When we lay out the use cases, that's the first conversation we like to have with customers: what are you trying to achieve with AI?
There's a lot of pressure on organizations to make progress, from their board, from their CIO, from their CEO. What are we doing with AI? Let's make progress. But what we're helping do is get them thinking a little bit about the future, that future-state architecture of distributed AI clusters. I give a lot of talks with our clients: hey, this is what you're dealing with today, but once you get that under your belt, you've got to have line of sight to what's going to happen out here. And I think that is causing clients, particularly enterprise clients, to say: okay, I'm going to have to scale this, to Danny's point, perhaps to 100 locations. What does that look like network-wise? What does that look like orchestration-wise and security-wise? And who am I going to trust to build that out? That is causing clients to take really, really close looks at the Cisco Secure AI Factory with NVIDIA, because those concepts are built in. That's something we talk about with clients every day, and it's what they're looking at.
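The agentic-ops pattern Neil describes, auto-resolving alarms a team has already fixed many times while escalating novel ones to a human, reduces to a simple triage policy. This is a toy sketch: the threshold, the function names, and the idea of keying purely on alarm type are all illustrative assumptions, not how any particular NOC product works.

```python
from collections import Counter

# Illustrative cutoff: after a human has resolved an alarm type this many
# times, the agent is trusted to remediate it without human review.
AUTO_RESOLVE_THRESHOLD = 3

resolution_history: Counter = Counter()

def triage(alarm_type: str) -> str:
    """Return the action taken for one incoming alarm."""
    if resolution_history[alarm_type] >= AUTO_RESOLVE_THRESHOLD:
        return "auto-resolved"
    # A human resolves it; record the outcome so the agent can learn the pattern.
    resolution_history[alarm_type] += 1
    return "escalated-to-human"

# The first three "link-flap" alarms go to a human; after that, the agent
# handles the same alarm type on its own.
actions = [triage("link-flap") for _ in range(5)]
```

A production version would match on richer signatures than a type string, record whether the human-approved remediation actually worked, and keep an audit trail, but the core loop, earn trust per pattern, then remove the human from that pattern, is the same.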
SPEAKER_01:And with these living, breathing environments, it's not just how do I get it up and running the first time. It's the lifecycle management and the maintenance: how do I keep this secure over time, how do I operate it at scale, how am I going to manage the upgrades and the changes that need to happen? And I think that also very much contributes to a lot of the help that we all collectively provide in order to meet what our customers are trying to achieve.
SPEAKER_02:Yeah, and just to add on to it: some of the things your companies are already doing with these enterprise customers are planting the seeds for planning for the future, because there are a few fundamental things in my head that some of these customers need to start considering. Number one is that the data they have in their organization isn't generally ready on day zero to just hand to an agent or an AI to process. So there is some work: what is my critical data that I need to take out of cold storage and put into hot storage, for example, so the AI has fast, high-bandwidth access to it to gain insight or make business decisions? Planning for data and storage at the early stages is definitely a critical one. The second one, and this applies to myself as well: if we just go back two years, this model horizon, this ecosystem where something like 40 different models are being released in any one quarter, plus new versions of those models, was brand new. As these organizations become aware of the advances, the accuracies, and the costs of different-sized models, that starts to become second nature, and they can work with our partners to make good decisions about tokenomics, about which models to use, which ones give them the proper accuracy. And then I think the third part is the pieces that Cisco is now integrating into that AI factory. There are fundamental things we provide oversight on. With NVIDIA AI Enterprise, for example, we do basically daily scanning of the entire repository of open models that are out there.
Open models are built upon hundreds of open source libraries, so we're constantly scanning, patching, and supporting all of the open source libraries that make up these models and workloads. And then we give that fundamental security to Cisco to offer to a partner or to a customer. So there are foundational parts of that Secure AI Factory that deal with the open models, deal with security, deal with performance and everything else. And once customers start evaluating that, they're doing a lot of the fundamental things that will allow them to build for AI into the future as well.
SPEAKER_03:Yeah, Neil, is that all part of good, healthy GPU optimization, or is that maybe a fourth part of what Chris is talking about?
SPEAKER_00:Yes, some of it is part of it, and that's what we help clients with. And I think Danny mentioned it as well: the day-two operations of this. It's one thing to get a cluster installed, an AI factory installed. It's another to operate it on a daily basis, monitor utilization, and handle the virtualization aspects I mentioned, being able to schedule multiple workloads from potentially multiple clients in your business onto that infrastructure. That's an ongoing process, and we spend a lot of time teaching teams how to do it, because it is different. The concepts are similar. If you think about data center architecture in general, the concepts of virtualization and scheduling workloads on top of it are not new. How do you secure it? How do you segment it for multi-tenancy? These are concepts that have been around quite a long time, but they're new for AI infrastructure. How you do that is different, how you apply the tools you use is different. That's one of the things we do every day; it's probably one of our most popular engagements right now. Once we get them to understand, look, this is where the data is going to live, these are the models I'm going to use, these are the use cases I'm trying to bring to life, here's how I'm going to secure it, the next step is: okay, how do I operationalize this and get the outcomes that I really want? We actually have a very large client, together with NVIDIA, that put it right in the requirements of the project and said: you have to guarantee that I'm going to get 80% utilization of our GPUs within six months.
And that takes some tuning and tweaking, and teaching clients how to actually operationalize that and get it accomplished.
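A utilization target like the 80% guarantee Neil mentions only works if it is continuously measured. A minimal sketch of checking a cluster against such a target, using invented sample data (real deployments would pull per-GPU metrics from telemetry tooling such as NVIDIA DCGM; the numbers and GPU names here are assumptions):

```python
# Hypothetical per-GPU utilization samples (percent busy over a window).
samples = {
    "gpu-0": [92.0, 88.0, 95.0, 90.0],
    "gpu-1": [70.0, 65.0, 72.0, 68.0],
    "gpu-2": [40.0, 35.0, 50.0, 45.0],
}

TARGET = 80.0  # contractual utilization target, as in the episode's example

def cluster_utilization(samples: dict[str, list[float]]) -> float:
    """Average utilization across all GPUs and all samples."""
    points = [p for pts in samples.values() for p in pts]
    return sum(points) / len(points)

util = cluster_utilization(samples)
print(f"cluster average: {util:.1f}% (target {TARGET}%)")
if util < TARGET:
    # Under-utilized GPUs are candidates for consolidation or for
    # scheduling additional workloads onto the shared infrastructure.
    laggards = [g for g, pts in samples.items() if sum(pts) / len(pts) < TARGET]
    print("below-target GPUs:", laggards)
```

In this toy data the cluster averages well under the target, which is exactly the situation where the multi-workload scheduling Neil describes earns its keep.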
SPEAKER_03:Yeah, well, Danny, maybe build on that and answer Neil's question. What does some of that operational playbook look like? Is it baked out fully, or are we still kind of exploring as we go?
SPEAKER_01:I think there's a lot baked out, but I also think we're in constant exploration mode because of how much things are changing. For me, I'll hit it at a little bit of a higher altitude and then bring it back down. The biggest change I've seen in the last year: a year or two ago, people were saying, hey, how can I get as many GPUs as I can get? There was sort of a blank check being written. What has changed more recently is the question of how do I get the best ROI out of this environment? Now that people have learned more, it's not just about making this real and trying to do something useful. It has to have value for my company, and there's an ROI trade-off. What that has done for all of us sitting here, and we all play a piece in this, is force us to get better at sizing, to get better at consulting earlier in the project and saying, you know, maybe you don't need a B300 to run a small inferencing model that we could do on an RTX 6000 Pro. There's a lot of education that goes along with that, and helping people understand what expansion is going to look like if and when the project truly takes off and becomes more pervasive across the larger environment. So I would say it's tying a business need and a business problem, which is pretty much how the whole world works, to a properly sized technical solution that brings value back to the organization. That's what's changed for all of us, and I actually think it's a good thing for the industry.
SPEAKER_02:Yeah, and I'll just add that fundamentally that's what we've been moving toward with the enterprise reference architectures we work with Cisco to scope, build, and size. We give the starting points, the building blocks, of either RTX PRO servers or our HGX infrastructure, all the way up to our Blackwell NVL72s. We give starting guides: do you want to start with a four-GPU server but have the flexibility to build that into a 32-node system? And we give sizing guides based on the model and the application they're looking for, whether it's RAG or agentic workflows. Fundamentally, we're trying to make those decision points easier and more accessible. But it's really critical to have a proving ground where customers can come in and experience real-world workloads on the infrastructure they're targeting. So this is a three-way partnership that's really, really important.
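The sizing guides Chris describes can be thought of as a mapping from workload type and scale to a recommended building block. The tiers below are invented purely for illustration; the actual NVIDIA and Cisco reference architectures publish their own configurations and thresholds.

```python
# Illustrative sizing table (hypothetical tiers, not an official guide).
# Each row: (max concurrent users, workload type, recommended building block).
SIZING_GUIDE = [
    (50,   "rag",     "single server, 4x RTX PRO GPUs"),
    (500,  "rag",     "HGX node, 8 GPUs"),
    (50,   "agentic", "HGX node, 8 GPUs"),
    (5000, "agentic", "multi-node HGX cluster (up to 32 nodes)"),
]

def recommend(workload: str, users: int) -> str:
    """Pick the smallest listed tier that covers the workload and user count."""
    for max_users, kind, block in SIZING_GUIDE:
        if kind == workload and users <= max_users:
            return block
    # Beyond the table: exactly the case for a proving-ground evaluation.
    return "engage a sizing exercise in the proving ground"

print(recommend("rag", 30))
print(recommend("agentic", 2000))
```

The point of a table like this is the escape hatch at the bottom: anything that falls outside the published tiers is where hands-on evaluation against real workloads matters most.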
SPEAKER_01:I'll even take it one step further. Once you get done with the advising and the consulting in the very early pre-sale stages, then it's: how are you making it easier to actually order that solution? Because it is complex, and there are a lot of components in that stack. And Neil, one of the things we try to do for you is put those solution catalogs together so that when we do a validated design for a solution, you've got a set of PIDs or a single SKU you can order, just to increase that velocity, because there's a lot of time involved in this.
SPEAKER_00:Yeah, that makes it that much easier, to your point, Danny. And one of the things we also do with our proving ground is, while that equipment is on order, and there are sometimes lead times, we're all dealing with supply chain challenges and constraints of various types, while we're preparing to install it at the customer's data center or wherever it's going to be installed, we actually jump into the proving ground and say: let us teach you now. You don't have to wait for that to arrive and get installed to get going. We can start now on prototyping and on learning how you're going to operationalize this. We're doing that for a number of clients right now. They've already placed their order for infrastructure, but we're jumping in the lab with them to keep making progress, so that when it is installed and ready to go, they're ready to turn on the workloads and go.
SPEAKER_03:Neil, I want to stick with you here. We only have a few more minutes left in this episode, and the three of you have been gracious with your time. We've maybe only touched the surface of this so far: power and cooling. What is Cisco Secure AI Factory with NVIDIA doing right now to address those areas? And what do you expect to happen for the foreseeable future, knowing that this is going to be a very, very important issue as this all keeps going?
SPEAKER_00:Well, it is definitely an issue, but like I said, we do have solutions for that. We can find places to host workloads; there are a lot of options out there. There are the hyperscaler providers, the neocloud providers, the colo providers. And if it so happens that you need to upgrade your own private data center, we've got really, really good people who have done that for many years and know how to help you design it and get that infrastructure uplifted for whatever you're trying to run there. Some of our customers have gotten creative and said, you know what, I'm going to move some of my traditional workloads out to a colo environment to free up some of that power and cooling, because I really want the AI cluster to be in my private data center, on my own property. So there are things we can do today to alleviate that relatively easily.
SPEAKER_01:And I come back to the point I was starting to make earlier. In addition to everything Neil just said, I also think there's education to do with customers. I'm very surprised, especially when a customer was doing proof of concepts in the cloud on older, previous-generation GPUs they were able to get. Chris, you probably want to comment on this, but there's so much increase in performance with every generation. It's amazing when you hear you guys talking about 10x performance on Rubin compared to what came before, plus it's performance per watt too. So when customers go to build on-prem and they're actually buying the latest and greatest, sometimes they think they need the largest and they really don't. We're saving them not just money; it's the ability to have the power and cooling that can support their needs. So I think right-sizing the environment is really, really important, along with helping customers understand the application they're going to run and the realities of what it requires. Quite often that's a lot less than they expect, especially in the smaller inferencing, RAG, or fine-tuning use cases.
SPEAKER_02:Yeah, I completely agree. The rate of improvement of our accelerated computing portfolio is hard to comprehend. We've announced our new Vera Rubin platform, which started, obviously, at the high end with our NVL72 platform. If you look at the tokenomics, as I like to call it, it's a 10x improvement, a reduction in token cost, over Blackwell. And Blackwell was already a 10x decrease over Hopper. So in just two generations, we're talking about a 100x decrease in token cost on our largest platform. And there are several things, as Danny alluded to, that contribute to that. One of the most powerful is a data format we call NVFP4. It's a four-bit floating point way to represent the data, but an NVIDIA-specific version of it. What it does is allow us to both infer and train at the same accuracy as standard FP8, but you get almost a 2x improvement in the memory capacity available for the model when it's being held in the infrastructure. So there are several things on the software side that play into that efficiency. Yes, if you're comparing to a GPU that was four or five years old, you probably don't need a fully liquid-cooled rack to accomplish what you're doing.
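The memory effect Chris describes is straightforward arithmetic: halving the bits per weight roughly halves the bytes needed to hold the model's weights. A quick sketch, where the parameter count is an arbitrary example and real footprints also include KV cache, activations, and quantization overhead:

```python
def model_weight_bytes(params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, ignoring cache/activations."""
    return params * bits_per_weight / 8

params = 70e9  # hypothetical 70B-parameter model, for illustration

fp8 = model_weight_bytes(params, 8)  # standard FP8: one byte per weight
fp4 = model_weight_bytes(params, 4)  # NVFP4-style 4-bit format

print(f"FP8 weights: {fp8 / 1e9:.0f} GB")
print(f"FP4 weights: {fp4 / 1e9:.0f} GB")
print(f"reduction:   {fp8 / fp4:.1f}x")
```

That factor-of-two is why a 4-bit format lets the same GPU memory hold roughly twice the model, provided, as Chris notes, the format preserves FP8-level accuracy.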
But I love what Neil was saying. In the new accelerated computing mantra, if you have workloads consuming power in your data center that you can move to a colo, move off-prem, then you can take advantage of the data that resides on-prem and bring in new servers and infrastructure. It's a great trade-off for an organization, because power is really the core constraint we're trying to use properly as a compute metric. And that's why we have basically three types of infrastructure customers can choose from, all the way up to Vera Rubin, where we're at around 200 kilowatts per rack now for peak performance across 72 GPUs. But we scale that down into HGX, bite-sized chunks of eight GPUs, or the flexibility of our enterprise platforms from Cisco with UCS, where you can put in one or two PCIe GPUs and get to a 2,000- or 3,000-watt platform that fits in legacy enterprise racks. That flexibility and efficiency across the spectrum, from PCIe cards to the highest end, gives customers a lot of choice and options when deploying.
SPEAKER_03:Yeah, that's awesome. So we have just a few minutes here, and I do want to ask a little bit about the future. Within the confines of compute, we're talking today about efficiency and scale, about supporting agents, very much a today discussion, and about physical AI moving forward. If we have this same conversation at this time next year, what is going to be defining it? I know we still have GTC to get through, we still have Cisco Live to get through this year, and I know you can't necessarily divulge any of those details, but what's going to be the predominant theme as we hit the end of this year, or even the beginning of next? Neil, we can start with you.
SPEAKER_00:I think this year will be the year of agentic. We're in the early stages there. Very few organizations actually know how to build agentic operations and agentic applications at scale. We've done it. It's taken us a couple of years of learning to get to the point where we're at, and we know how to do it, and we're delivering it for clients now. But I think the model there, the architecture of agentic, the idea of breaking a very complex problem down into smaller chunks, going to the right model source to get the right answers, bringing that back together, and then, to Chris's point earlier, being able to take some action upon it: I think that's going to become very real at a wider scale than it is today. There are a few out there, but I think it's going to become a lot more mainstream. So if we're having this conversation next year, I'd say we'll be talking a lot about scaling agentic and how that experiment has worked out.
unknown:Yeah.
SPEAKER_03:Danny, do you want to build on that, or have you got something else you're excited about?
SPEAKER_01:I'll take it a different way, though it's some of what we've already touched on, and I agree with everything Neil just said. I think that in the enterprise space especially, this year is about doing. Last year was about evaluating, trying not to be the canary in the coal mine, learning from what others in the industry are doing. And I'll double down on edge. We've gone through the phase where everybody's got their own version of ChatGPT installed that's serving the company's needs, solving HR problems, querying databases, answering questions. As you move into an agentic world, there are so many applications and use cases turning up for how to change the customer experience, how to change the employee experience, how to do more with the data being generated by humans, whether that's through cameras, through voice, through sensors. I see a lot of really interesting things happening where AI interacts with humans in the places we visit in our personal lives.
SPEAKER_03:Yeah. Chris, are you going to go three for three on agents, or have you got something else up your sleeve?
SPEAKER_02:You know, I think it's a combination of all of the above. I 100% agree that agents and agentic AI are going to become pervasive this year. It's going to hit most of the business operations of the large companies around the world, which are already exploring it, and we already have Neil and team helping deploy it for these customers. So the agentic part, taking actions based on that data, is going to be critical. The other place we're really going to see it is how AI plays a role in different verticals and vertical workflows, bringing accelerated compute into a lot of these new verticals. Some of the key ones happening right now are in the engineering spaces. We've partnered with several companies, from Cisco to Cadence, Ansys, and Siemens, to integrate AI into what were traditionally CPU-based workloads and to bring new efficiencies and new capabilities to their platforms, which leads into the physical side of AI. And I think, like agentic this last year, where everybody was experimenting and getting up to speed on agents and AI, the next step absolutely is going to be how AI intersects the physical world. We see so many humanoids under development now. But how we build digital twins, how AI will help make decisions on traffic and weather and other parts of these verticals: I think that's the year 2026.
SPEAKER_03:Yeah. Fun times ahead for sure.
SPEAKER_00:And Brian, I'll add one that I hope we're not talking about next year, which is the AI bubble, and ROI and how many companies are not getting ROI. I see the clickbait every day, and my colleagues and I study it and think: that's not what we're seeing with our clients. A hundred percent of the projects WWT is working on with clients are successful. So if you're not part of that, you can seek out help. But I hope this time next year we're not talking about an AI bubble. This is here to stay. It's almost like asking, looking back, was there ROI created by the internet? Of course there was. It's a silly question, but I think we're going to have the same kind of phenomenon occur here. We're going to look back in five or ten years and go: why were we even asking about the ROI of AI? Of course there's ROI.
SPEAKER_03:Yeah, and certainly a lot of the work that our three companies will do over the course of this year and far beyond will help put that AI bubble talk to bed. To the three of you, thank you so much for joining the podcast today. Hopefully we'll run into each other at the likes of GTC and Cisco Live. Thank you all again, and we'll have you on again soon.
SPEAKER_01:Thank you for having us. See everybody.
SPEAKER_03:Okay, thanks to Daniel, Chris, and Neil for joining. As this conversation makes clear, the organizations planning for power, security, operations, and utilization from day one are the ones making the biggest advancements. They aren't just chasing more compute or betting on the next breakthrough model. They're taking a disciplined approach. This episode of the AI Proving Ground Podcast was co-produced by Nas Baker, Kara Kuhn, Diane Devery, and Addison Ingler. Our audio and video engineer is John Knoblock. My name is Brian Felt. Thanks for listening. See you next time.