AI Proving Ground Podcast: Exploring Artificial Intelligence & Enterprise AI with World Wide Technology

Why Private AI Is Winning in the Enterprise

World Wide Technology: Artificial Intelligence Experts Season 1 Episode 72

Private AI is quickly moving from niche architecture to a core enterprise AI strategy. As AI workloads move into production, leaders are rethinking where inference runs, how data stays governed and why hybrid AI infrastructure may offer the best balance of performance, cost and control.

In this episode of the AI Proving Ground Podcast, VMware by Broadcom’s Chris Wolf, NVIDIA’s Chad Olds and WWT CTO Mike Taylor explore why private AI is gaining momentum across the enterprise. What began as a design choice for highly regulated industries is becoming a practical model for organizations that want to scale AI while maintaining governance, protecting intellectual property and managing infrastructure costs.

The conversation examines how inference workloads are spreading across cloud, on-prem and edge environments, why enterprises are becoming more protective of their data and how the economics of sustained AI workloads are shaping infrastructure decisions.

You’ll learn:

• Why private AI is emerging as a practical enterprise AI architecture
• How hybrid AI environments balance cloud, on-prem and edge workloads
• Why data governance and IP protection are driving AI infrastructure choices
• How inference economics are reshaping enterprise AI strategy
• What leaders must do to treat AI as operational infrastructure—not a side experiment

If you're building an enterprise AI strategy or scaling AI into production, this episode offers a clear look at how organizations are designing infrastructure to support AI reliably, securely and at scale.

Support for this episode provided by: Eaton

More about this week's guests:

Mike Taylor leads WWT’s Global Engineering, IT and Services organization, helping position WWT as a single-source partner for digital transformation. He connects WWT’s deep technical expertise with business strategy to help customers adopt emerging technologies and drive meaningful outcomes.

Chad Olds leads software sales and customer success across the Americas at NVIDIA. With more than 15 years of experience in AI, data and cloud transformation, he has held leadership roles at NVIDIA, IBM, Anchore, Software AG and startups, and began his career at Red Hat driving open-source adoption.

Chris Wolf leads the Private AI business for Broadcom’s VMware Cloud Foundation division, overseeing AI strategy, architecture and engineering. He works with enterprise leaders to design and scale AI infrastructure that balances performance, governance and cost.

The AI Proving Ground Podcast leverages the deep AI technical and business expertise from within World Wide Technology's one-of-a-kind AI Proving Ground, which provides unrivaled access to the world's leading AI technologies. This unique lab environment accelerates your ability to learn about, test, train and implement AI solutions. 

Learn more about WWT's AI Proving Ground.

The AI Proving Ground is a composable lab environment that features the latest high-performance infrastructure and reference architectures from the world's leading AI companies, such as NVIDIA, Cisco, Dell, F5, AMD, Intel and others.

Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments. 

Why Enterprises Are Rushing to Private AI

SPEAKER_03

By 2028, nearly half of large enterprises will move AI into private environments. Not for ideology, but for survival. Because once AI moves from pilot to production, the stakes change. It's no longer about generating tokens; it's about protecting data, controlling cost, and making sure your digital workforce actually shows up to work. So in today's episode, we'll unpack why private AI is accelerating and what it takes to run it at scale. The conversation will cover the economics of inference, the operational reality of agentic AI, and why hybrid architectures are becoming the default, not the exception. Joining me on the episode are Mike Taylor, CTO at World Wide Technology; Chris Wolf, Broadcom's global head of AI and advanced services; and Chad Olds from NVIDIA. Three operators in the middle of what enterprise AI is becoming: hybrid, production grade, and cost accountable. This is the AI Proving Ground Podcast from World Wide Technology. Today's show starts with the shift itself. Why now? And what's really driving enterprise leaders to bring AI closer to home. So let's jump in. Okay, well, gentlemen out in virtual land, thank you for joining. Chris and Chad, how are you both this afternoon?

SPEAKER_02

Doing well, doing well. Appreciate you inviting us here. Anything with WWT and Broadcom is always fun to join.

SPEAKER_01

100%. That's what I said to myself in kindergarten I'd be doing right about now.

The Hidden Forces Driving Private AI

SPEAKER_03

Love it. And Mike, thanks for hanging out with me here in the studio. How are you? Doing well. Wouldn't miss this. This is gonna be great. Excellent. Chris, I do want to start with you. I was reading an IDC report before we started recording, and it showed that by 2028 some 40% of large enterprise organizations are expected to shift their AI workloads to private environments. I'm wondering if we can just start with the why. Is this more about mitigating risk, or is it more about actively pursuing a new value stream? Why is this shift unfolding right now?

SPEAKER_01

Yeah, I think it's both, actually. You certainly have risk and compliance concerns. There's a lot of legal gray area in some of this space today. To give you a specific example, take any standard NDA that a company would have with a partner or a customer; there's going to be some clause around non-disclosure to third parties. Now, where does a public cloud LLM sit in that? If you don't have full visibility of the chain of trust and you can't prove it via an audit, then that's a third party, right? That's how a lot of our customers are interpreting that clause. Some are being a little more bold, but a lot of others are saying, you know what, legal precedent isn't clear, and I don't want to be the one to set it. So you have the privacy concerns, which are absolutely legit, but the other side of this is cost. When you own or lease infrastructure, the pricing dynamics shift. You don't have to build your own data center if you don't have one; even leasing capacity that you control changes the model. You're now paying a flat rate for the infrastructure in most cases, you're not paying per token anymore, and you can fully optimize it. So we see cost reductions anywhere from 30% to 50%, even higher, for AI inference. That's not to say you'll use this for every use case, but when you're thinking about production AI inference, where quality, control and cost matter, this is absolutely the trend we continue to see.
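
To make the pricing dynamic Chris describes concrete, here is a back-of-the-envelope sketch in Python. Every figure (token volume, per-token rate, monthly infrastructure cost) is an invented assumption, not a quoted rate from any provider; the point is only that flat-rate capacity decouples cost from token volume.

```python
# Hypothetical comparison of per-token cloud pricing vs. flat-rate
# owned/leased capacity for a sustained inference workload.
# All numbers are illustrative assumptions.

def cloud_cost(tokens_per_month: int, usd_per_million_tokens: float) -> float:
    """Pay-per-token: cost scales linearly with volume."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def private_cost(monthly_infra_usd: float) -> float:
    """Owned or leased capacity: flat rate regardless of volume."""
    return monthly_infra_usd

tokens = 20_000_000_000  # assumed 20B tokens/month in production
cloud = cloud_cost(tokens, usd_per_million_tokens=2.00)  # -> $40,000/mo
private = private_cost(monthly_infra_usd=22_000)         # -> $22,000/mo

savings = 1 - private / cloud
print(f"cloud ${cloud:,.0f}/mo, private ${private:,.0f}/mo, savings {savings:.0%}")
# With these made-up inputs, savings land at 45%, inside the 30-50%
# range cited in the conversation; real numbers vary by workload.
```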

SPEAKER_02

Yeah, I'd add one other thing; I fully agree with that. AI has moved to production now, and production happens in a lot of places, right? It's not just the one data center where you got GPUs initially and started to run all your stuff. You've got your first model up and running, and now you're trying to do inference. That might still happen in that cloud data center, but a lot of it is gonna happen on-prem and a lot of it's gonna happen at the edge. You can't just have it sitting in one place anymore. Just by the nature of how pervasive AI is gonna be in all of our customers' business processes, it has to be everywhere. Therefore, just by definition, it's got to go hybrid.

Private AI vs Hybrid: What Actually Works

SPEAKER_03

No, absolutely. Mike, bringing you into the conversation here: we hear a lot about private AI or private cloud, whether that means GPUs or managed platforms. Right now, as we sit in the market heading into the bulk of 2026, what actually qualifies as private AI?

SPEAKER_04

Yeah, I think the important point there, both from a market perspective and in terms of how our customers are thinking about it, is that it isn't about a single data center or a single physical location where these algorithms and this computation are going to happen. We're already starting to see it proliferate out, and there are two forces that come into that. By the way, I think the panel here would agree we've seen these themes play out before, and we can draw some similarities: think about prior waves of edge compute, or the introduction of robotics beyond just industrial use cases. We need to assume these computations are going to happen anywhere from our laptops and phones to hospitals, clinics, branches, et cetera, and you need an extensible architecture and ecosystem to support those different flavors and versions. One piece I would build on, Chris, from your comment about risk, compliance and data sovereignty (and I want your feedback on this too, both of you) is the cost of potentially training your own foundational model. Three or four years ago, we really thought that was going to be maybe eight really big companies going out and doing that. The base or foundational models are definitely going to be there, but we see more and more interest from customers in how, in a controlled environment, they can train a model on just their data, or just the influences they want to apply, for the purposes of the intellectual property they've collected and the points of view that bring differentiation to the market. Personally, three or four years ago, I would have thought, wow, that's going to be really expensive or difficult to do. And we've seen things from DeepSeek to other releases from the model makers that afford us the opportunity to think more creatively about what's possible in a private AI environment.

SPEAKER_02

Well, you know, I think we all thought that, right? You've got Musk saying it's gonna be billions to train an AI, and he's not wrong.

SPEAKER_04

Right.

SPEAKER_02

But with the models out there now, especially the open models, which lots of companies are focused on, I don't see a strong reason for folks to train theirs from scratch almost ever. You're much better off with RAG and fine-tuning and things like that. The other thought is that a lot of our customers are starting to focus on this too; I feel like every company should be a model creator now. What you have internally is the most valuable thing. We've been saying it since the data lake, data swamp days, whatever you want to call it: your data is your most valuable asset, and you want your data sitting inside the LLM that you're using. In a lot of cases, we're working with customers that say, wait a minute, my data is super valuable; this is a new revenue stream for me. I don't need to just give the data to other people or sell it as it sits. I can train this LLM on it, and now I can sell a service. So it's turning so many companies we work with that are very capex or physical-goods focused into services companies as well. So I totally agree: fine-tuning these things is the way to go moving forward. Chris, what do you think?

Open Models: Opportunity or Risk?

SPEAKER_01

Yeah, you gotta be smart, right? Building gigawatt data centers to try to train a mega LLM is not something the average enterprise would need or want to do. Starting with what's already out there and fine-tuning is absolutely the pattern. What we haven't seen fully play out yet is the emergence of domain-specific small language models. That's gonna take a little more time, and I think it will further improve efficiency and make things more interesting. There are some really good ones already: language translation, for example, is a very common generative AI use case, and there's a lot of good stuff in open source you can take advantage of. As these proliferate, the other thing to think about as you bring AI to your data centers or edge sites is energy demand. We see the pressure on the power grid. If you look at some of the estimates, energy demand is going to more than double by 2030, and supply certainly can't keep up with that. So you have to look at techniques to lower the overall energy requirements for AI, and being smart about your model choice is one of them.
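
Both guests favor RAG and fine-tuning over from-scratch training. As a rough, self-contained illustration of the retrieval half of that pattern, the sketch below scores a tiny invented document set against a query and assembles a grounded prompt; the bag-of-words scoring is a toy stand-in for a real embedding model.

```python
# Minimal RAG-style retrieval sketch. The corpus, query and scoring are
# all illustrative; production systems would use a real embedding model
# and a vector store, then send the prompt to a privately hosted LLM.
from collections import Counter
import math
import re

def bow(text: str) -> Counter:
    """Toy bag-of-words 'embedding'."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

corpus = [  # invented internal documents
    "Quarterly maintenance procedure for line 4 robotic welders",
    "NDA clause: no disclosure of customer data to third parties",
    "GPU cluster runbook: drain nodes before firmware updates",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = bow(query)
    return sorted(corpus, key=lambda doc: cosine(q, bow(doc)), reverse=True)[:k]

context = retrieve("how do we update GPU node firmware?")
prompt = "Answer using only this context:\n" + "\n".join(context)
print(prompt)
# The model never needs retraining: proprietary data enters only at
# query time, and it never has to leave the private environment.
```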

SPEAKER_03

Yeah. Chris, Chad mentioned that data is your greatest asset, and you were talking about power at the tail end of your recent answer. As it relates to breakthroughs in what private AI delivers: we hear from so many of the organizations we talk to on a regular basis, right, Mike, that they've adopted AI to a certain extent but can't scale it beyond that. So from a private AI perspective, is it data control, or what is the big value that's helping organizations scale their AI initiatives? Does it have more to do with data, cost predictability, or is it an all-of-the-above type of answer?

The Model Trade-Off Nobody Talks About

SPEAKER_01

Yeah, I think it's a few things. On data, organizations should have a hard principle, which is: it's your data. You shouldn't have to move your data into somebody else's proprietary repo format just to gain the benefits of AI. That should be a line you're not willing to cross, because if something better comes along, how do you get your data back? How much is that going to cost you? The exit costs can be enormous. So data is a core principle: being able to bring the AI model to wherever your data happens to be, not compromising on that, and not having to rewrite applications to gain the benefits. The other side of this is that private AI, to us, is about being able to safely operate AI in any environment. That's an important distinction, because there's a lot of focus on the AI token, and I think the token is the easy part. Generating a token with the right infrastructure and capacity is really not that hard. But how do you have observability? How do you have resiliency? How are you enforcing access controls, especially with agentic AI? These are the things that keep people up at night. And this is where, when you're thinking about production private AI, which is where a lot of these use cases are (they're production inference), it's having the tools, the controls and the automation to make it all happen and make it resilient and cost-effective.
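
As a minimal sketch of the access-control concern Chris raises, the hypothetical snippet below gates each agent tool call on an explicit per-agent allowlist and records every decision for audit. The agent names, tools and policy shape are all invented for illustration.

```python
# Toy per-agent tool allowlist with an audit trail. Real deployments
# would enforce this in a gateway or policy engine, not in app code.
from datetime import datetime, timezone

POLICY = {  # hypothetical agents and the tools they may call
    "support-agent": {"search_kb", "create_ticket"},
    "finance-agent": {"read_ledger"},
}

AUDIT_LOG: list[dict] = []

def invoke_tool(agent: str, tool: str, payload: dict) -> None:
    allowed = tool in POLICY.get(agent, set())
    AUDIT_LOG.append({  # every attempt is logged, allowed or not
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{agent} may not call {tool}")
    # ...dispatch to the real tool implementation here...

invoke_tool("support-agent", "create_ticket", {"summary": "GPU failure"})
# invoke_tool("support-agent", "read_ledger", {})  # would raise PermissionError
```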

SPEAKER_04

If I could just add to that, Chris: another pattern we see is that if we were talking a year or two ago, it would have been about AI sitting sidecar to core business functions or processes. As customers deploy more agentically, that computation needs to happen as close to the application as possible, if not embedded in it, because the data isn't in some lake somewhere; it's being generated in real time in the applications people are using. And it speaks again to data center, edge, cloud. We have to presume, and really have already seen with the more agentic architecture we've deployed internally (and you both have a bunch of internal initiatives where you're using your own stuff, not just the work you're doing with customers), that the closer we can get that agent to the applications themselves, or to the data being generated at that point in time, the better and more accurate the results. That feels very different, in terms of the consciousness in the market today, from when people were broadly thinking about agentic AI rather than practically implementing it into a business process.

SPEAKER_03

Yeah, it's interesting. Chad, I'm gonna spin to you here. Mike, you were talking about how the data needs to be as close to that location as possible. A bit of a devil's advocate question, Chad: what are the implications of moving to that type of process? Are there trade-offs organizations need to consider, recognizing that all decisions have some type of impact down the line?

Who Really Controls the Data?

SPEAKER_02

Yeah, there are, there are. It's a really important point to focus on where agents are going to work. Agents are going to be everywhere as well, right? And latency matters. Latency matters a lot, and so does where your data is sitting. Choice matters too; folks want to control their own costs and not get stuck, and Chris, you bring up a great point: getting your data out of places can be really hard. But when you do move to hybrid, there is something you have to consider: it takes more planning. Cloud-based APIs are very easy. That's why at the beginning, from September of 2022 to probably a year in, I told folks: just hit one of the cloud APIs, be very, very careful with your data, but hit those and use those. Because otherwise you have to spin up the infrastructure, you have to know the people, you have to work with the right folks. The beauty now, and I could name probably five customers we're working with alongside WWT and Broadcom to deploy this stuff, is that it's very feasible to do. You do lose some of the convenience of being able to just run something right away if you don't already have the infrastructure stood up. But anyone that didn't realize they needed the infrastructure set up by now is probably not paying attention to the right things.

SPEAKER_03

Mike, I'm asking you because just before we started recording, you were talking about how this is happening in real conversations with customers. From those conversations, are you sensing that the market is ready right now from a maturity standpoint, or are we still in the early stages?

Edge vs Cloud: The Latency Problem

SPEAKER_04

Yeah, I still think we're in the early stages, but I'm an optimist on this stuff, and I don't think we give ourselves enough credit sometimes for the rate and scale at which we're moving. The real barrier to broad-based adoption within our customers is data maturity. And Chris, to some of the points you made earlier, I think in some part that's been created by this sense of everyone building data lakes or data repositories around systems of record. They got pretty good at that, but there's so much more now that these language models can interpret, things we didn't even think about collecting or structuring. In the most extreme cases, we have customers taking things off of tape and trying to get them back into the corpus of data they can include in these models. So our limitations are not technical in most of the engagements we're working on. It's: to what extent are we able to collect the data? To what extent can we tolerate imperfect data in the early days? I don't think there's ever been a better time to have imperfect data, because these models can help us sort that out, whereas before we needed every i dotted and every t crossed. Certainly health and safety is a different category from some of the business productivity things we're doing. So I'd like to give us a little more credit than we tend to get. The most popular use case we're working with, and it sits squarely in what we're discussing today in terms of intellectual property and the importance of control, is coding assistants, and how they've helped developers, and really engineering in general, become more productive. In fact, the conversations we're having with customers are about having their code base somewhere else, not knowing exactly where it is or who's in control of it, and that's something we continue to work through with them. And if they're doing this for developers today, we see the opportunity for automation engineers and other engineering or interpretive roles across the enterprise to benefit from tools like this. The coding assistant, to me, is just the beginning.

SPEAKER_03

Yeah.

SPEAKER_04

And we're gonna play out some of the gates we're working through in that context across multiple personas within an enterprise.

SPEAKER_02

You just brought up an awesome point about not waiting until the data is perfect before feeding it into these models and using them. Think back through all the different databases and analytics tools we all used: data being extremely accurate and extremely well formatted was paramount. Now it's not. And I do see a split in how companies handle this. Some companies still think, we've got to get this perfect, we can only take data from this one source because it's our source of truth. But you can put weights on the data, right? So I think it's important that organizations understand, and we do this a lot internally now: as long as you make sure your data is safe, and as long as you're putting the right data in there, it doesn't have to be super well structured. In fact, we're shocked all the time by how well it handles some of this data.

SPEAKER_00

This episode is supported by Eaton. Eaton provides power management solutions to ensure system reliability and efficiency. Optimize your power infrastructure with Eaton's innovative products.

SPEAKER_03

Yeah, I think you were about ready to jump in there, but Mike just mentioned that data maturity still stands as the number one blocker to scaling anything, and in this case private AI. What other blockers exist out there? Is there a one-two-three checklist of items organizations need to get through to execute effectively?

Your Data Isn’t Ready for AI

SPEAKER_01

Yeah, there are a few things you have to think through here. I want to go back to Chad's earlier point about how I think about my application portfolio overall, moving from, say, a cloud-based AI service to an on-prem or private service. One of the things we did very early on was to align around an OpenAI-compatible API. We expose that through our API gateway, so if you started building an application on OpenAI or a compatible platform, you simply change a URL pointer to hit our gateway instead of a cloud service, and now it's completely running in a private environment without any code modifications. So there are tricks like that; you don't have to fall into these all-or-nothing traps. And as you start to think about that initial private deployment, this comes up a lot. People ask, where do I start? And I say, start in the cloud. If you don't know the use case, leverage on-demand capacity to find it. When you decide you want to run it as a production inference service, the economics far outweigh a public cloud model today by a long stretch. We've continued to rerun the math just to check our own assumptions; you can put it on a spreadsheet and do the per-token cost comparisons, and it's fairly consistent. I mention that because when you get to that first use case, you can start to size infrastructure for it, make the investment, stand it up, and then you're going to have times with unused capacity. You can use that unused capacity for additional experimentation and some GPU-as-a-service, so you're maximizing utilization all the time. This is where having the right automation tools can make a huge difference for an organization. I can't tell you how many times I've seen somebody bring AI capacity into a data center, redo all of the data center power infrastructure to support it, and wind up being able to use maybe 40% of it. How much did you waste with a really poor design, versus looking at techniques that let you subscribe or even oversubscribe your capacity? You can set up spot instances for some of your research workloads so you can dynamically shut them down if, say, your most critical production inference service has a GPU failure. These are things you can apply very early on, again, with the right partners. And that's why you have the three of us together here, because these are the things we can help organizations with.
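
The URL-swap pattern Chris describes can be sketched with the standard openai Python client, which accepts an alternate base_url for any OpenAI-compatible endpoint. The gateway address, token and model name below are hypothetical placeholders, not real endpoints.

```python
# Moving an app from a public cloud LLM to a private, OpenAI-compatible
# gateway by changing only the endpoint. Endpoint/model names are invented.
from openai import OpenAI

# Before: the public service (the client's default endpoint).
# client = OpenAI(api_key="sk-...")

# After: the same client, pointed at an internal API gateway that fronts
# a private inference engine (e.g., vLLM or Triton behind the gateway).
client = OpenAI(
    base_url="https://ai-gateway.internal.example.com/v1",  # hypothetical
    api_key="internal-token",                               # hypothetical
)

resp = client.chat.completions.create(
    model="private-llama",  # whatever model the gateway exposes
    messages=[{"role": "user", "content": "Summarize this NDA clause."}],
)
print(resp.choices[0].message.content)
# Request and response schemas are unchanged, which is why the migration
# amounts to a URL swap with no application code rewrites.
```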

SPEAKER_02

Yeah, the tools point is really important when it comes to orchestration, because once you compare saturating GPUs in the cloud versus saturating GPUs on-prem, the cost difference is pretty clear. The area folks have been concerned about historically is: how do I keep my utilization high? I don't want to pay for things I'm not using. But there are so many schedulers and orchestration tools now. You've got Slurm, you've got Run AI, you've got a lot of different tools that can help you not just split up the GPUs so they can be shared, but set priorities. You may have five groups that all say they want 40% of the GPU capacity, but in practice they're in different places, sometimes running big jobs and sometimes running nothing at all. There are plenty of schedulers out there now that can make full use of on-prem private AI.
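
As a toy sketch of the priority idea Chad describes, the snippet below lets a high-priority production job preempt a preemptible research job on a shared GPU pool. Real deployments would rely on schedulers like Slurm or Run AI; everything here, including the job names and sizes, is a simplified stand-in.

```python
# Toy priority scheduler: production inference evicts preemptible
# research ("spot") jobs when it needs GPUs. Illustrative only.
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    gpus: int
    priority: int           # higher wins
    preemptible: bool = False

@dataclass
class Cluster:
    total_gpus: int
    running: list = field(default_factory=list)

    def free(self) -> int:
        return self.total_gpus - sum(j.gpus for j in self.running)

    def submit(self, job: Job) -> bool:
        # Evict lower-priority preemptible jobs until the new job fits.
        victims = sorted(
            (j for j in self.running if j.preemptible and j.priority < job.priority),
            key=lambda j: j.priority,
        )
        while self.free() < job.gpus and victims:
            evicted = victims.pop(0)
            self.running.remove(evicted)
            print(f"preempting {evicted.name}")
        if self.free() >= job.gpus:
            self.running.append(job)
            return True
        return False  # a real scheduler would queue rather than reject

cluster = Cluster(total_gpus=8)
cluster.submit(Job("research-spot", gpus=6, priority=1, preemptible=True))
cluster.submit(Job("prod-inference", gpus=4, priority=10))  # evicts the spot job
```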

The Real Cost of AI Inference

SPEAKER_04

Yeah, I'm gonna steal the moderator seat here for a second, because as you both were talking through that, so much of what you're saying resonates with what we're doing and seeing in our AI Proving Ground, and you've both been fantastic partners in building those capabilities out. A couple of the trends we've talked about triggered a thought, and I want to get both of your perspectives on it. The more agentic we get, the closer these agents get to applications, and frankly, the more important those agents become to the conduct of business processes and how we service our customers. Those agents become production applications themselves, right? They're just as important as the employees on the ground who show up every day and interface with our customers and one another to serve the markets we work in. As you think about the work you're doing to help our customers operate with consistency and quality, with all the things that came with the advent of virtual machines and container-based architectures, these high-performance architectures and AI workloads need the same care and feeding. You mentioned Run AI and some of the other tools that can carve up the GPUs and provide those services, but I know there's a lot of investment happening, both internally and through acquisitions. I'd love to give you a moment to share, maybe starting with you, Chris, the investments you're making in helping customers more reliably operate these high-performance architectures and AI workloads.

Why GPUs Sit Idle

SPEAKER_01

Yeah, let's talk about that, because we've been really focused on this, investing in the shift to production inference before it became mainstream. Most folks you talk to today will tell you that in the next couple of years your AI inference capacity relative to training capacity is probably somewhere in the 60 to 70 percent range, so there's huge momentum in this direction. We purpose-built our solution for these things. The way we thought about it was: if you're gonna do AI at scale, you've got to have your infrastructure-as-a-service layer set up really well. That house has to be in order. We do that with Cloud Foundation, giving you a baseline built on technology such as our Distributed Resource Scheduler, because AI capacity automation is not just about GPUs; it's CPUs, memory, network I/O, storage I/O. It's very complex, and this is technology we've been evolving for more than 20 years now. We have native hooks into Run AI, so these tools can work together, and hooks into the dynamic resource allocation project. So that's one piece: the IaaS layer. Then as you go up the stack, we've done a lot of work to make service enablement really easy. The way I like to explain this, and the way we like to partner, is that we look at these as services you'd get inside a server operating system. They're there; you use what you want to use. If you want to use a partner service, we're like, hey, cool, no problem; that's good for us too. Everything we've done northbound is open source. You have CNCF-conformant Kubernetes; I mentioned DRA support. The way we do model governance was actually one of the first in the industry; it's based on open source Harbor. AI models are shipped as containers, and you do not need a proprietary tool to govern containers; you can use tools like Harbor, which is already there. So we made that really simple. We have a CLI that hooks all of these tools into NVIDIA's NGC cloud, and you can hook into Hugging Face if you want to pull down open source models. Again, these are all native developer constructs. We support NVIDIA's Triton Inference Server, and if you want open source, we support vLLM and llama.cpp for different types of inference engines. And then as you go across, we have our own agent builder. So we've done a lot of work to bring all of these AI services in. The other industry first for us, if you compare us to other infrastructure providers: all of our AI services are included free with our IaaS platform. Everybody else charges for them as an add-on, but to the point you made, Mike, AI is going to be in every application. If it's going to be there in every application, how on earth can you charge just to enable AI at this point? We think it's table stakes. That's how we've packaged our software, and we look forward to our competitors following us, because that'll be good for customers.

SPEAKER_03

Yeah, Chad, build on that a little bit. You know, what what's your reaction to to a little bit about what Chris is saying? But then also, you know, how are how is NVIDIA and and Broadcom working together just to make all of this um easier? You know, Chris had mentioned how you know the economics are there if and when you're ready to make it happen. So what are the two of you all doing together to really enable all that?

Running AI Like Real Infrastructure

SPEAKER_02

Yeah. One of the things we love about our relationship with Broadcom, and VMware before, is that we all have an almost intrinsic belief that AI should be open source as much as possible. It's so important, and you see it with some of the earliest grandfathers of AI and how focused they were on trying to keep things open. It's not just because we want everyone to be able to use it, though that's important; open source makes things better. I started my career at Red Hat and built a firm belief in that while I was there, which was not the standard at the time, but you've seen it blossom. Because Broadcom and NVIDIA have embraced open source so deeply, it's really easy for us to work together. Chris said it perfectly: you want to use Triton, you want to use vLLM. We're a huge contributor to vLLM. We love to package up Triton into a NIM to make it a microservice that's really easy to deploy. It all works really well with what Chris and the Broadcom teams are doing. We've been working together for over a decade; look back to vGPU and the other pieces we've built together. That partnership being based on a foundation of everyone believing open source is important, and on allowing tools to collaborate, has made it super easy to partner with Broadcom, and it's something we continue to embrace. Look at some of our acquisitions. We acquired Run AI, a scheduler; Chris mentioned it, and their stuff plugs in beautifully. We bought Run AI because GPU orchestration was becoming extremely important, and yes, the cloud providers have kind of figured it out, but everyone needed to be able to use it. We bought SchedMD; Slurm is extremely important when you start looking at HPC environments. All of these have massive open source components or are entirely open source. Now, our latest iteration of this: look at the Nemotron models. We love all the models out there; we love them because most of them run on NVIDIA GPUs, right? But one thing we found is that open models can be really expensive to train, even when they're smaller models. And there aren't a lot of organizations incented to have both open weights, which is what you usually hear of with open models these days, and open training data sets. That open training data set is something that isn't talked about as much when we talk about open models, and our customers were very clear they needed it. NVIDIA has a reason to create these models: we want AI to become pervasive, we want GPU usage to be out there, and especially when we start going down the sovereign AI path, which we could spend a whole hour on, open models are very important. So we've created many different models in Nemotron, from voice models to very large models like you'd see from some of the other providers. That basis of open source just makes it really, really easy to work together.

SPEAKER_03

Mike, to take back the question you asked when you very effectively took over the moderator seat (no problem at all, by the way; anytime you want to jump in, I'm happy to have your questions): you were asking the two of them about this evolving story and where they're coming from. Understanding what Chris and Chad just said, how are you synthesizing it for our customers, who are just trying to make sense of a very complex landscape?

SPEAKER_04

Yeah, I think that outside of the data, the operation of these environments at scale is some hard yards right now for our customers to carry. The work that's happening to make these environments behave like what customers are accustomed to operating, frankly, is some of the work being done here in integrating these two platforms together. I still think there's an emerging recognition that as you get more agentic (I don't have a better name for it; you'll have to work on that), these agents really are coworkers. They just happen to be doing the work in a digital sense, and when they don't show up to work, things don't get done; when they're not performing well, things don't get done. So it's all the integration we talked about here, but it's also this idea, and I welcome your insights on this again because I need help with it, that this is as important a services-based architecture as we've ever built: those microservices are now agents, digital workers doing this work. I even look at some of the functions we have internally. We're gonna have 10,000, 20,000 agents working inside this organization. How are we ensuring they all show up to work, that they're productive, that they're talking to one another and doing so in a secure sense, that we get the power of 24/7 availability and the productivity it brings in conjunction with our employees, the people doing some of these jobs and supporting them? It's really important work, and it's still developing in terms of how we're gonna keep track of it. If you're a CIO, CTO or application leader thinking about the 5,000 apps you use to run your enterprise today, you're gonna have those 5,000 apps and you're gonna have 50,000 digital workers. Our collective responsibility is to figure out how we best operationalize the management, productivity, security and resiliency of all of those assets. There are hard yards to come, but I think we're seeing more and more of that realization and recognition with our customers, and I speak for the folks on the panel here. So it's an exciting one to chase. Yeah.

Inside Broadcom’s AI Strategy

SPEAKER_01

Yeah, Mike, you really put it well with the football metaphors there. The hard yards to come: it's like the Philadelphia Eagles offense this year, so I can definitely relate as an Eagles fan. But on the hard part, the thing we've talked around here that I want to circle back to is that NVIDIA has done a phenomenal job with NIMs, its inference microservices. The magic in all of this is that there are more than a hundred of them, and Chad will know the exact number. What's really cool is that when you're trying to put a model on an accelerator and on infrastructure, you need the right device driver, the optimized kernels, and the optimized framework. There's an entire software stack that has to get certified around the model or agent you're looking to deploy or pair. We do all of this end to end. I can just pick from the catalog, drop it onto my infrastructure layer, and that full software stack is delivered; it just runs. Think about how fast you can iterate, experiment, and try new things. Because to Mike's point, if I'm looking at tens of thousands of these things, you've got to enable your teams to move fast. And that's what we've been able to do with our partnership.

unknown

Yeah.

SPEAKER_02

Yeah. I love that, Chris. The tools are there; the tools are all there, and a lot of companies are working hard to make them easy to use. I think there's a daunting aspect sometimes when you look at this and think, gosh, how do I enter this? How do I actually get into production? And I've got to say, just a plug for WWT on this: we work with a lot of very strong deployment partners that go in and architect and build and deploy. They do all those things to help major customers be successful. WWT rivals some of the best of the really small partners that have their AI geniuses; they can go head to head in a small environment or an early architecture. The thing that's pretty incredible about WWT right now is that as those customers go from initial architecture to POC to "oh my gosh, this is actually working, this agent is actually helping my customer service department," WWT has been able to deploy the right people with the AI skills necessary. That's a big piece of making these pervasive in medium and especially large organizations; I've seen it. And it's been impressive, because you always wonder, as these things scale, whether there will be enough people to help. Part of that is the work Broadcom is doing to integrate all these things and make deployment easy, but the other part is that just getting started and architecting it takes some confidence. And WWT has done that for a couple of our very large, very complicated customers.

SPEAKER_01

I just want to add one other key difference that I've observed. Mike and I go back, I don't know, 10 or so years at this point, and the thing I've really respected about WWT is, Chad, you nailed it with the tech prowess around AI, but what sets WWT apart is integrity. As a company culture, WWT always does what's best for their customers and clients and always serves their best interest. You can't say that about a lot of other vendors or providers you might look to work with.

SPEAKER_03

Well, we certainly appreciate those comments, and we appreciate the partnership from the two of you and your respective organizations as well. We are coming up on the bottom of the episode, so we'll have to hurry it up a little bit. Chad, I want to stay with you for a moment. It certainly doesn't appear that the hybrid approach is going to fade away. But how do you see those hybrid architectures evolving as we start to explore what should be private, what should be public, and what should be in between?

NVIDIA, Open Source, and the AI Stack

SPEAKER_02

Yeah, it's a great question. Look at what we're seeing right now with robotics. We're all hearing about humanoid robotics, and people are questioning how real it's gonna be. Now we've just heard that Musk's Tesla is going to take a lot of the capacity in their manufacturing facilities and move it over to humanoid robots. We're seeing robotics accelerate really, really quickly, and I think the why behind it is fun, so take a slight detour. I was working with Omniverse, our metaverse digital twinning application, before Gen AI really came out. Gen AI comes out, and all of a sudden the acceleration goes parabolic, because Gen AI can help train these robots in simulation. We've got lots of examples now where, in less than an hour, we're training new robots how to walk on unstable surfaces. So to get back to your question, what that drives is that hybrid architecture is gonna be absolutely critical. There are two major forces I see. One is that people are moving to production and starting to say, ooh, maybe some of these workloads make more sense on infrastructure I own, where I know exactly what my performance and my security are. The other driver is gonna be where the data sits and where the data needs to be worked on. In a lot of cases, you want it on the robot; you want it at a physical location that might be remote somewhere. And sure, you may send the data back to some central repository one way or another. But it's hard to overstate how transformational that move to hybrid will be for all of us and all of our companies if it goes the way we're starting to think it will.

SPEAKER_03

Yeah, Chris, where do you see that boundary going? Is it going to just continue to be a little bit of a gray area for the foreseeable future?

SPEAKER_01

Yeah, I think I'm seeing more and more folks get intentional about it and understand what they can do in each environment. There are enough common design patterns here that are really helping us. You have tools like Run AI that have been focused on these environments for years now, and things in open source like SkyPilot that are really focused on hybrid AI management and automation use cases. So there's work to do, but if we back up, Mike's talked about agentic AI a couple of times, and the MCP protocol just got OAuth in, I don't know, August or September, right? So the hybrid architecture is one thing, but sometimes it's the basics: how am I doing tracing between multiple MCP agents? How can I prove to an auditor that, like Mike said, all of your digital employees are not doing things they shouldn't, or passing information to each other that one of them doesn't have access to? These are the low-hanging fruit that are going to be really critical as we evolve this forward. But I think the most important part on hybrid is accepting that private cloud is a legitimate deployment target for AI services. What's happened for me with multiple chief AI officers and chief data officers I've talked to is that it clicks when I explain the privacy, the control, and then the economics. If you can cut your costs in half and you choose not to, you're doing a disservice to your business and to your shareholders. You truly are. Why wouldn't you do that? And if you're ambitious about AI and you can cut your costs in half, now you can double your AI initiatives. Awesome, go for it. So the biggest blocker to hybrid right now is really some of the myths out there, perpetuated by some of the mega hyperscalers that want all of your workloads and all of your data. You don't have to do that. We can help you. Come work with all of us; we'll show you how it's done.

NIMs: The Secret to Faster AI

SPEAKER_03

Yeah. Well, recognizing that, Mike, we'll end with you. What has to change, technically, operationally, or maybe even culturally, if private AI is going to continue to be more and more of a viable alternative? What has to change for it to become that kind of boring, predictable infrastructure we know is gonna deliver?

SPEAKER_04

Yeah, I mean, a lot of the innovation we talked about today is what's driving that. As we think about our business, and I appreciate the kind words from both of you about what we're doing together, I think it comes down to having high-trust relationships with people who are squarely sitting in the spaces trying to solve these problems: learning from the Proving Ground, giving you all feedback on what customers are asking for, where we stand, what they care about. This is where the right partnerships and relationships matter, and for us, what we do and how we do it are equally important. These conversations today, from my perspective, represent where the market needs to go and, to an extent, the progress that's been made. As more of our customers understand what you've laid out, Chris, and the points you've supported in the discussion, Chad, it's almost irresponsible not to consider how you would take this on in terms of a private architecture. Again, the AI we have today is not the AI we're gonna have tomorrow, and the proliferation of these agents and how they're going to impact our productivity makes for exciting times ahead.

SPEAKER_03

Absolutely. Well, to the three of you, I know we're closing in on time. Thank you so much for joining. Chris, Chad, thank you for joining virtually, and thank you for the partnership with Broadcom and NVIDIA. It's always a pleasure to catch up, and we hope to see you soon.

SPEAKER_02

Thanks.

SPEAKER_03

Okay, thanks to Mike, Chad, and Chris for joining. What we heard today is simple: private and hybrid architectures aren't philosophical choices. They're operating decisions about where the data lives, how your agents behave, and whether you can scale AI without losing control of cost, security, or performance. This episode of the AI Proving Ground Podcast was co-produced by Nas Baker and Kara Kuhn. Our audio and video engineer is John Knoblock. My name is Brian Felt. Thanks for listening. See you next time.

Podcasts we love

Check out these other fine podcasts recommended by us, not an algorithm.

• WWT Research & Insights (World Wide Technology)
• WWT Partner Spotlight (World Wide Technology)
• WWT Experts (World Wide Technology)
• Meet the Chief (World Wide Technology)