AI Proving Ground Podcast: Exploring Artificial Intelligence & Enterprise AI with World Wide Technology
AI deployment and adoption are complex — this podcast makes them actionable. Join top experts, IT leaders and innovators as we explore AI’s toughest challenges, uncover real-world case studies, and reveal practical insights that drive AI ROI. From strategy to execution, we break down what works (and what doesn’t) in enterprise AI. New episodes every week.
Inside NVIDIA’s AI Factory
NVIDIA's John Gentry and WWT's Derek Elbert outline how AI factories are evolving from isolated GPU clusters into flexible, multi-tiered, distributed systems — and why data strategy, security and ecosystem partnerships now determine whether enterprises can keep pace.
The AI Proving Ground Podcast leverages the deep AI technical and business expertise from within World Wide Technology's one-of-a-kind AI Proving Ground, which provides unrivaled access to the world's leading AI technologies. This unique lab environment accelerates your ability to learn about, test, train and implement AI solutions.
Learn more about WWT's AI Proving Ground.
The AI Proving Ground is a composable lab environment that features the latest high-performance infrastructure and reference architectures from the world's leading AI companies, such as NVIDIA, Cisco, Dell, F5, AMD, Intel and others.
Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments.
From World Wide Technology, this is the AI Proving Ground Podcast. Today, beyond the hype surrounding generative AI, a more practical idea is taking shape in the enterprise: building an AI factory. It sounds simple: consume data, produce intelligence, and scale. But in practice, it's become one of the most confusing and consequential ideas in enterprise technology, because beneath that buzzword is a real shift in how modern organizations will build, run, and secure the AI that increasingly powers their business. And few companies have shaped this conversation more than NVIDIA. Not by inventing the term, but by pulling it into the mainstream, defining the architecture, and showing the world what an AI factory actually looks like. Today we're joined by two experts at the center of that movement: John Gentry, a key member of NVIDIA's AI factory engineering team, and Derek Elbert, who leads many of World Wide Technology's efforts to help enterprises operationalize these factories. Both of them spend their days inside the tension between what AI could be for the enterprise and what it takes to build it responsibly, at scale, and in the real world. And why getting this right may be the difference between organizations that simply adopt AI and those that learn how to manufacture intelligence as a core capability. So let's jump in with John and Derek.

Okay, so across the broader industry, the AI factory has undeniably become a buzzword. But in practice, it's proving to be a genuinely useful analogy for enterprises, because it takes this highly complex process of building AI systems, whether that's foundation models, fine-tuned models or agentic systems, and simplifies it into something universally understandable. Like any factory, it's about inputs and outputs: on one side, power and data; on the other, intelligence. But more granularly, an AI factory requires clarity on what you're assembling, the outcome you're trying to produce, and the tools or machines inside the factory that make it possible. From a practical standpoint, those tools translate into compute, networking and storage, increasingly delivered as purpose-built, fully integrated systems. Everything from the mechanical and electrical design to cooling to software to management and upper-level frameworks contributes to both running the AI factory and enabling the outputs it creates. The analogy definitely holds up, but it becomes meaningful only when you start defining how it works in the real world. So, given that framing, and that was a little long-winded, Derek, how do you see enterprises actually operationalizing this concept? What's the first meaningful step they should be taking to turn the quote-unquote AI factory analogy into something actionable inside their organization?
SPEAKER_00:So I think from a WWT perspective, we've added this to what we call our flywheel: AI Studio, AI Foundry, and AI Factory, with AI Factory being the build and the scale, which focuses on those hardware components that John mentioned: compute, network, storage. The one piece we like to incorporate into that stack that wasn't really mentioned was AI workflow orchestration and cluster management. So how are you bringing all of those components together to actually produce the business outcome or the solution? To John's point, the concept of a factory to me means you are making something that is going to then be delivered. And in this case, a lot of times what we're doing for customers is driving an application, a business use case or an actual solution that they can either, A, sell to their customers and make a profit on, or, B, use internally for their own employees to gain better experience, better efficiency, whatever it may be. But they're actually going to produce something. Is it going to be tangible, something they can touch and feel? Maybe not, but it's definitely something that they can then either take to market or take to their internal users to be consumed.
SPEAKER_01:Yeah, I think you make a great point there, Derek, in terms of what sits beyond the physical infrastructure. I would divide that software layer into a couple of different functional categories, and you mentioned them. There's cluster management, or infrastructure operations: the actual how do I stand it up and provision it. Even that aspect is multi-layered and very sophisticated, from making sure I have the right driver stack to the OS to orchestration. What am I running in terms of the underlying infrastructure software? Then there's how do I lay the AI workflow tools on top of that: is it some flavor of MLOps, which is still highly fragmented with a whole ton of options? What other tooling am I bringing to bear? And then what I'm finding, and I'd love to get your perspective on this too, Derek, is that when we bring this to the enterprise, now that we've moved beyond the frontier model builders and the big investors, who have their own workflows or are working at the bleeding edge, they expect this to be operationalized in the context of their enterprise IT environment. Everything we've talked about, from cluster management to workflow orchestration to the overall AI workflows, has kind of existed outside of that construct until now. So I get a lot of questions: How do I operationalize this? How do I integrate this into my existing IT workflows, my existing IT operations stack? So yes, beyond the physical infrastructure there's a whole lot of software, both on the infrastructure software side and on the application software side from an AI perspective.
SPEAKER_00:Yeah, and with that, there's a huge wrapper of security around all of this. There are two areas where our team isn't necessarily the expert, where we have to reach out to other groups inside World Wide Technology: specifically observability and monitoring, that AIOps piece, and then clearly security. Because the message is: you need to make sure this is secure, and the foundational security practices you've already been doing within your IT management stack still exist, but now you need to layer in the fundamental components of security for AI. How are you going to protect the LLM? How are you going to enable guardrails? How are you going to protect the embeddings, the inputs, the outputs? Not necessarily focused 100% on the data, but more on operationalizing it just like you would all of your traditional data center infrastructure.
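To make that layered-guardrails idea a bit more concrete, here is a minimal, illustrative Python sketch of pre- and post-generation checks wrapped around a model call. It is not any particular product's API: the `generate` callable, the blocked-input patterns and the PII regex are hypothetical placeholders for whatever policy engine and inference endpoint an enterprise actually uses.

```python
import re

# Hypothetical policy: block prompts that try to override instructions or
# exfiltrate credentials, and redact obvious PII-looking strings in output.
BLOCKED_INPUT_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
    re.compile(r"(api[_-]?key|password)\s*[:=]", re.IGNORECASE),
]
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # e.g., US SSN-like strings


def guarded_completion(prompt: str, generate) -> str:
    """Wrap a model call with simple input and output guardrails.

    `generate` is any callable that takes a prompt string and returns text;
    in practice this would be your inference endpoint or SDK call.
    """
    # Input guardrail: refuse prompts that match known-bad patterns.
    for pattern in BLOCKED_INPUT_PATTERNS:
        if pattern.search(prompt):
            return "Request blocked by input policy."

    raw_output = generate(prompt)

    # Output guardrail: redact PII-looking strings before returning.
    return PII_PATTERN.sub("[REDACTED]", raw_output)


if __name__ == "__main__":
    fake_model = lambda p: f"Echo: {p} (customer SSN 123-45-6789)"
    print(guarded_completion("Summarize last quarter's results", fake_model))
```

In a production stack these checks would sit alongside, not replace, the existing perimeter and identity controls Derek mentions.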
SPEAKER_01:Yeah, and that's a shifting landscape too, as the maturity around how these applications are being used, and the diversity of how they're being used, continues to grow. When we at NVIDIA first worked with customers, it was largely around model training: actually training a model with enterprise data, which very few enterprises have the wherewithal to do. Mostly it's fine-tuning, it's retraining, but it's still often done in an AI factory that exists as kind of an island, an isolated factory that sits independent. So it was cordoned off. Security was less of a concern because it was behind the firewalls, deep in corporate IT or in a colo that was isolated and sequestered to provide only that function. As customers move from that to actually putting these models into place, putting these outcomes into application flows that are fully integrated and customer-facing or internally facing, to your point, now the whole set of implications around security, perimeter security versus embedded, actual code security and so forth, starts to come to bear. So it has been an evolution as we've seen it, moving from training to inferencing to the flywheel of bringing those inferencing outputs back to fine-tuning. Now this thing has to be integrated. And one of the things I keep coming back to with the AI factory construct, to use that analogy: an AI factory that isn't connected to the supply chain doesn't work. Traditionally that factory would have sat on the railroad, where it could get the raw materials in, and on the highway, where it could get the finished goods out. The same analogy applies now: this AI factory has to sit in an infrastructure locality where data, models or inputs can come in, and the finished goods, the application, the inferencing outputs or other things, can actually get to the users or be distributed. So it's sitting in the hub of a hub-and-spoke construct. And I think that's emergent; it's just entering into the considerations as customers build out purposeful AI factories as opposed to experimental use cases. Now you're powering a whole business application that's driving revenue, or a productivity application for the enterprise. How do I make that available and keep it secure? Absolutely a different consideration.
SPEAKER_02:Yeah, Derek, I love the way John's talking about that: connecting it into the supply chain, getting it connected with the highway infrastructure, so to speak. Keeping with that factory analogy, what types of challenges or constraints might that put on enterprise IT teams as they try to actually realize the value of the AI factory?
SPEAKER_00:So when I think about the highway, the first thought that came to mind was data. Data is that highway here. And a lot of times within enterprises, especially where they're not just isolating this data for training and they're actually going to use it, their data is a mess. The reality is that they want to use all of this proprietary data, this IP that they've been building for decades sometimes, but that data needs to be cleansed, prepared and curated in order to then be used. So what we're seeing is a trend where, yes, if you're going to deploy a large GPU cluster, you obviously need high-performance storage that sits next to it; scratch storage is what it's been deemed in a lot of the environments we're deploying. But what we're seeing at other customers is that they're taking a much larger bucket, 20 up to 50 petabytes, and using that as a staging environment, more or less, for that data, to prepare it to be fed into the AI system. And so that is a fundamental shift. Everybody has known that data is important, and good AI only comes from good data, but now it is actually becoming imperative that these enterprises start doing something with their data. And it starts with having an actual data strategy. That was the first thing that came to mind when you said highway: the data is that highway, or it's running on that highway. And without it, this factory is no good.
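As a rough illustration of the kind of cleansing and curation Derek describes happening in that staging tier, the following Python sketch (using pandas, with hypothetical column names and file paths) deduplicates and normalizes raw records before they are handed to a fine-tuning or embedding pipeline.

```python
import pandas as pd


def prepare_for_training(raw_path: str, staged_path: str) -> pd.DataFrame:
    """Toy data-prep pass: cleanse, curate and stage records before they feed
    an AI pipeline. Column names and paths are illustrative only."""
    df = pd.read_csv(raw_path)

    # Cleanse: drop rows missing the text we actually want to learn from.
    df = df.dropna(subset=["document_text"])

    # Normalize: collapse whitespace and strip stray leading/trailing noise.
    df["document_text"] = (
        df["document_text"].str.replace(r"\s+", " ", regex=True).str.strip()
    )

    # Curate: remove exact duplicates, which otherwise skew training.
    df = df.drop_duplicates(subset=["document_text"])

    # Stage: land the curated set in the staging volume or bucket.
    df.to_csv(staged_path, index=False)
    return df
```

A real pipeline would add classification, PII handling and lineage tracking on top of this, but the flow (cleanse, curate, stage, feed) is the same.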
SPEAKER_01:Yeah, you bring up something that is absolutely very timely, and it is a shift in mindset. We used to talk about what kind of storage I need next to my GPU cluster, which was scratch-based: load a data set that already exists and make it available to do the training or the fine-tuning. That has branched into not a storage strategy but a data strategy: what are the data repositories I have in my environment? We first started seeing this when it was sort of a bolt-on, which would be retrieval-augmented generation or something like it, where I've got this large data store, I'm going to vectorize it, and I'm going to allow my core AI to go reference it to get contextually relevant or timely information. I think of that as a student who keeps their notes when they go to take the test. If the inference question is the test question, the AI knows enough to know where in the notes to go look, but it doesn't know the information itself. Increasingly, that's moving to: I need to actually train a model on that information. I either need to train multiple specialized models or really fine-tune a larger model to have all the context-relevant data. That's a completely different data strategy. That means a data pipeline that can now classify, curate and, in many cases, select. A lot of the conversations we had early on were around legacy big data analytics applications that weren't really letting go of the traditional ML approach to embrace AI. Well, those are the best data sets to actually bring to AI, but it's a completely different discipline to query a data set versus prepping a data set to train a model to answer much broader questions. So I think this is an area enterprises are still trying to figure out: I can't train on all my data, my data is all over the place, so how do I bring a data strategy to complement my AI strategy? And that's just the input side of the factory. The output is then connected to a whole different network of highways to get it out and make it available to consumers, be they internal or external. But I think you bring up a great point, Derek: the state of data has become increasingly a challenge. We used to use the rough ratio that 80% of the time to train a good model is typically spent in data prep, actually getting the data ready for training, as opposed to the actual training time.
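The "student who keeps their notes" pattern John describes is retrieval-augmented generation: embed the notes, find the ones most similar to the question, and prepend them to the prompt. The sketch below shows that flow with a toy hashing embedder and cosine similarity; a real pipeline would call an embedding model and a vector database instead, and the sample notes are invented for illustration.

```python
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedding: hash characters into a fixed-size unit vector.
    A real pipeline would call an embedding model here."""
    vec = np.zeros(dim)
    for i, ch in enumerate(text.lower()):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


def retrieve(query: str, notes: list[str], top_k: int = 2) -> list[str]:
    """Return the notes most similar to the query by cosine similarity."""
    q = embed(query)
    scored = sorted(notes, key=lambda n: float(np.dot(q, embed(n))), reverse=True)
    return scored[:top_k]


notes = [
    "Q3 revenue grew 12% driven by the services segment.",
    "The warranty policy covers parts for 24 months.",
    "GPU cluster maintenance is scheduled for the first Sunday monthly.",
]
question = "How long does the warranty last?"
context = retrieve(question, notes)

# The model only has to "look up the notes," not memorize them.
prompt = "Answer using these notes:\n" + "\n".join(context) + "\nQuestion: " + question
print(prompt)
```

Fine-tuning, by contrast, bakes that information into the model weights, which is the heavier data-pipeline path John contrasts it with.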
SPEAKER_02:Well, John, I know you're a key member of NVIDIA's AI factory engineering team, and NVIDIA is, of course, in a very unique position where it sees the entire ecosystem end to end, with so many partners and organizations plugging into what you're doing. You've already mentioned some of this, like the quote-unquote supply chain or highway system. But from what you're seeing as customers and organizations try to operationalize these AI factories, are there any other surprises popping up that you would have thought would be taken care of by now, or any trends emerging that are a little bit ahead of the curve right now that you can shed some light on?
SPEAKER_01:Well, there are certainly a few things I find interesting. One is sort of the result of the conversation we were just having around data. The other is what's emerging in terms of what's coming next, which in the world of AI seems to arrive faster than anyone anticipates; it's so hard to keep up, and for the enterprise that presents a unique challenge. On the data side, NVIDIA really views itself as an enabler more than anything, and we very much try to be the "and" company: NVIDIA and this partner and that partner. So we're developing capabilities that have been labeled the AI data platform, and we're making that broadly available to a bunch of different providers to bring AI into the data tier to do some of that classification, curation and so forth, to make it easier and faster for enterprises to bring their data closer to the AI, as opposed to having to move all the data, with the data gravity implications and everything else. How do you start to share this? So the first thing is that's one example of "it takes an ecosystem," and NVIDIA recognizes that as much as anyone out there: we are not going to do this alone, and we are not experts in every aspect of it. Let's make capabilities available to those that have that expertise, whether it's on the data side or likewise on the enterprise integration and operationalization side, which is where partners like WWT play a key role. We're not going to tell every enterprise here's how you integrate this into your identity and authentication management system. That's something WWT knows well, along with understanding what layers of access should be provided to which individuals in the organization, and how to keep certain things separate while making this a shared resource. The other aspect of the factory analogy is that this isn't for one group. This is a factory for all the outputs that need to be produced in the enterprise; it's purpose-built for generating AI, but it has to be flexibly applied, whether that's customer service for an internal use case or product augmentation for an increased revenue stream. So I think that's one of the trends that's very, very apparent, and we're investing heavily in that ecosystem and in those partnerships, whether it's partners like WWT, partners like the OEM manufacturers building the physical infrastructure associated with AI factories, or the ISV ecosystem adding AI to its applications, where we can help facilitate those integrations and see AI grow not just through the core AI, but through the way a ServiceNow, as an example, is using AI embedded in their product offering.
SPEAKER_00:Yeah. So the one thing I want to touch on is that NVIDIA is definitely an enabler, but you're also an influencer. And I say that because you are leading the charge. When you coin terms like AI data platform, or AIDP, you get all of these vendors on the storage side starting to build toward that, or coming up with fundamental concepts that support that data platform. We are fortunate that we are partnered with all of those vendors on the storage side as well. In fact, before I came over here, we were having an AIDP conversation with some of the product team at NVIDIA, just to help understand where certain players are, how we can help support it and how we move forward with it. The other area you touched on, I want to talk about from a customer standpoint. We have a customer we've been working very closely with, and they initially built their AI per use case: little islands of AI infrastructure for each one. And that was working, but they were doing it in a more traditional development lifecycle, with dev, test and then prod, and it worked because they were deploying PCIe, very traditional servers. Now they have moved into this concept of wanting to deploy an AI factory, and they understand that, from a CapEx standpoint, they can't buy duplicates of everything on the compute side, because it just doesn't make sense financially. So they have started to transition into this AI factory mindset. But what that means is that now they're looking at this in terms of multiple users and multiple functional groups within the enterprise using the GPUs, and possibly using them differently. Understanding how they're going to consume those GPUs as a service becomes applicable. Are they actually going to carve out a tenant, where as a tenant administrator you can stand up the infrastructure the way you want it to look, with GPUs specifically assigned to your tenant? Or are you just trying to use an LLM that is running within the cluster, and consume it as a service? Understanding those different dynamics within an enterprise, and knowing that different functional business units are going to use it, opens the door to that ecosystem, where you have different layers of the stack helping with that multi-tenancy. And the reality is that all the big cloud providers have been doing multi-tenancy for decades. How they do it, and the means by which they do it, aren't fully known to everybody; it's very proprietary, and they all do it differently. But an enterprise doesn't necessarily have the same resources those cloud providers do to build that layer of the stack when it starts to carve out a GPU cluster or an AI factory. So we are seeing a lot of ecosystem partners that have a product off the shelf that we can be an evangelist for from WWT, and help make sure it solves that customer's problem or fits that solution, and do it in a timely manner, because they just don't have the resources internally at the enterprise to develop something themselves.
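One lightweight way enterprises express the softer end of that tenancy spectrum on a shared cluster is per-namespace GPU quotas. The sketch below uses the Kubernetes Python client and assumes the NVIDIA device plugin is advertising `nvidia.com/gpu` as a schedulable resource; the namespace and quota names are illustrative, and harder tenancy (as John notes next) requires additional isolation layers beyond this.

```python
from kubernetes import client, config


def create_tenant_gpu_quota(namespace: str, gpu_count: int) -> None:
    """Create a namespace and cap how many GPUs its workloads may request.
    Assumes the NVIDIA device plugin exposes the nvidia.com/gpu resource."""
    config.load_kube_config()  # or load_incluster_config() when run in-cluster
    core = client.CoreV1Api()

    # Soft tenancy: one namespace per team or business unit.
    core.create_namespace(
        client.V1Namespace(metadata=client.V1ObjectMeta(name=namespace))
    )

    # Quota: workloads in this namespace can request at most `gpu_count` GPUs.
    quota = client.V1ResourceQuota(
        metadata=client.V1ObjectMeta(name=f"{namespace}-gpu-quota"),
        spec=client.V1ResourceQuotaSpec(
            hard={"requests.nvidia.com/gpu": str(gpu_count)}
        ),
    )
    core.create_namespaced_resource_quota(namespace=namespace, body=quota)


if __name__ == "__main__":
    create_tenant_gpu_quota("bizunit-a", gpu_count=8)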
unknown:Yeah.
SPEAKER_03:Well, I mean, I can think that. Yeah, go ahead, John.
SPEAKER_01:And they really shouldn't, right? Why reinvent that wheel? I think there is a healthy ecosystem. And that harkens back to something you brought up earlier, Derek, which is the difference between cluster management and cluster provisioning, which at an enterprise level is kind of a holistic IT ops function, and then the workflow management or resource management that layers on top of that: how do I make that cluster available to the various constituents within the enterprise? Do I need to manage it as a multi-team environment, which means it's not Coke versus Pepsi, it's department A versus department B, so I can do things more loosely with virtualization or namespaces or other mechanisms? Or am I talking hard tenancy, where I'm actually carving this up and making it available to, say, a collaboration with a research university in a healthcare use case that needs to be cordoned off? NVIDIA is not providing that multi-tenancy itself; we're partnered with other ecosystem providers that are doing it at the network level, at a virtualization or virtual cluster (vCluster) level, or in Kubernetes. So again, it takes that ecosystem. But to your point, when it becomes a shared resource, it's not the old IT ops mentality of I've got a common x86 farm, I'm going to put a layer of virtualization on it, it's all generic, and then I can farm it out. This is a very sophisticated, fully integrated solution stack, whether I'm doing training or inferencing, whether I'm using one tool or another, and I need that whole stack to be available to a team when they go to use it, whether that's for fine-tuning or testing inferencing at scale. And that gets to the other trend, which is agentic reasoning and distributed inferencing, kind of that next wave of workloads. I think everyone thought, oh, if I build the model with all this compute, inferencing should be lightweight. And if you're running a proof of concept or a pilot and you're doing 10 concurrent sessions against a model, inferencing is pretty lightweight. When you roll that out and you're trying to do fraud detection on a million transactions per minute, it's not very lightweight; it needs to become distributed and enabled by a whole different approach. That's what I think is emergent. Some of what we build at NVIDIA and then make available is underlying technology that's going to be an enabler for that next-gen use case. If you look at the AI data platform, yes, it's very applicable to data curation, data prep and classification. Underneath the covers, though, it's also the foundation for distributed KV caching and distributed inferencing: actually putting these large context frames out there in a distributed fashion, managed at the storage layer, so that I don't have to compute everything on the fly and I can call on a much larger KV cache that's distributed out there, with a framework like Dynamo, for example. So yes, you see everyone running to adopt the AIDP as the intelligent data tier to feed one use case, but it's actually an underlying capability that's going to feed the next wave of growth. And a lot of times that's where NVIDIA is trying to innovate, and then we make it generally available, because that benefits everyone, right?
And I think that's one of the reasons I love working at NVIDIA personally, and it's so exciting: we are seeing around the bend, trying to bring that in, and then making it broadly available so the ecosystem can enable it, which drives everyone's success.
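To give a feel for the distributed KV-caching idea John mentions, here is a toy Python illustration of prefix reuse: store the attention state computed for a shared prompt prefix and look it up for later requests that start the same way. This is only a conceptual sketch of the technique, not how NVIDIA Dynamo or any specific inference server implements it; real systems cache per-layer key/value tensors and tier them across GPU memory, host memory and storage.

```python
import hashlib


class PrefixKVCache:
    """Toy illustration of KV-cache reuse: the state computed for a prompt
    prefix is stored so a later request sharing that prefix only has to
    compute its new tokens."""

    def __init__(self):
        self._cache: dict[str, object] = {}

    @staticmethod
    def _key(tokens: tuple[str, ...]) -> str:
        return hashlib.sha256(" ".join(tokens).encode()).hexdigest()

    def store(self, tokens: list[str], state: object) -> None:
        self._cache[self._key(tuple(tokens))] = state

    def lookup_longest_prefix(self, tokens: list[str]):
        """Return (cached_state, number_of_tokens_covered by the cache)."""
        for end in range(len(tokens), 0, -1):
            key = self._key(tuple(tokens[:end]))
            if key in self._cache:
                return self._cache[key], end
        return None, 0


cache = PrefixKVCache()
system_prompt = "You are a fraud analysis assistant .".split()
cache.store(system_prompt, state={"layers": "precomputed kv tensors"})

request = system_prompt + "Review transaction 42".split()
state, covered = cache.lookup_longest_prefix(request)
print(f"Reused cached state for {covered} of {len(request)} tokens")
```

At fraud-detection scale, that reuse is what keeps a shared system prompt or long context from being recomputed for every one of the million transactions per minute John describes.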
SPEAKER_02:Yeah. I mean, John's talking about making it more distributed, making it more lightweight. From where you sit right now, at least, how would we advise organizations or clients to move toward that distributed architecture?
SPEAKER_00:So I think we are just naturally moving toward that type of infrastructure. There are only going to be so many people out there that can actually build and train large models. There may still be fine-tuning out there, but we are naturally gravitating toward that distributed inference, and we're seeing it. The RTX 6000 Pro and the benefits it has, not just for rendering and videos and pictures and images, but also for low-precision inference and training, plus the fact that it has very, very low power consumption, lean right into this same talk track we're having around inference for the enterprise. I still have conversations with enterprises on the facility side where you look at the NVL72 rack-scale solution and it's 150 kilowatts, and they're still looking at 10- to 15-kilowatt racks in their enterprise, which leans more toward the RTX 6000 Pro in terms of being able to actually use AI in their existing data center and not having to either retrofit, purchase new, or go look at a colo. So I think we're moving that way. And NVIDIA, again, I think is leading the charge because of the innovation. It's going to take time for enterprises to catch up. They're generally slower adopters, and life cycles for a lot of their hardware are three to five years, which we need to work around, because the innovation cycle at NVIDIA is typically 12 months: every 12 months something new is coming out. So enterprises need to get on that same train of thought so they can keep up with the latest innovations. You think they'll get there? I don't know. I don't know if they'll get there or not.
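For a sense of what running inference in reduced precision on a workstation-class GPU can look like, here is a hedged Hugging Face Transformers sketch. The model ID is a placeholder (use whatever your hardware and licensing allow), `device_map="auto"` assumes the accelerate package is installed, and bfloat16 stands in for whatever lower-precision format the GPU in question supports.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id; swap in a model your GPUs, license and use case allow.
MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # half the memory of FP32, well suited to inference
    device_map="auto",           # place layers on the available GPU(s); needs accelerate
)

prompt = "Summarize why data preparation matters for enterprise AI."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The same pattern scales down to a single 10- to 15-kilowatt rack, which is the gap Derek describes between most enterprise facilities and a full rack-scale system.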
SPEAKER_01:I don't think so either. And you bring up a couple of great points. I'm really glad you brought up the RTX Pro 6000 Blackwell Server Edition, because when we brought that Blackwell architecture into the RTX line, you got the best of both worlds. That's by far the most universal AI system: it can also do general compute in a lot of ways, and it does very well in traditional HPC and scientific applications. So it fits the enterprises that say, I have to buy a multi-use type of system, I can't go buy a dedicated large-scale platform even if I could accommodate the 150 kilowatts (and if you get 150 you're lucky, because it's more like 170). Yeah, true, true. It's also where I've started talking about purpose-built from a GPU affinity perspective. These are conversations I'm having a lot more now, because customers can't stay on the bleeding edge. A lot of them are saying, hey, what are my H200s good for, and how should I use them, because I'm going to use them for three to five years. For your more traditional HPC workloads and some of your scientific computing workloads, the H200 is still the best platform. For your training and your fine-tuning, yeah, Blackwell systems, core Blackwell, HGX or DGX, H200 or B200, are going to be the sweet spot. When you talk inferencing at scale, massive inferencing, that's what the GB200 NVL72 was built for; that's an inferencing behemoth. It's also a massive training system, but very few customers, to your point, are going to adopt it. The RTX Pro 6000, that Blackwell Server Edition, absolutely is an inferencing machine. I don't think any enterprise that adopts one is going to retire it when the next gen becomes available; I think they're going to repurpose it. I talk to customers who say, here are my top 10 use cases, I'm buying for those. As soon as they've got those, there are a hundred behind them. So I may buy next gen to support moving those top 10 from building to production deployment; the next 10 will go on what we were building those on as they adopt new, and then it all just tiers down, kind of like it has historically. And there'll still be the core. This is why I think flywheel is such a great expression for this. You're going to have the hub, which is probably still going to be large systems doing training, fine-tuning, even some initial inference POCs, inference testing, figuring out how many concurrent sessions per GPU I can support, before moving it out to the edge of the spoke, which is where the lighter-weight distributed inferencing is going to take place. Data is going to be pulled back to fine-tune, and you start to get that flywheel effect. It is a difficult conversation right now in the NVIDIA AI factory world. I'm talking about H200, B200, B300, GB200 NVL72, GB300 NVL72, which is coming, and the RTX 6000 Pro Blackwell Server Edition. All of those are in the conversation when we walk in with a customer, and they ask which one is right. And it really depends. Frankly, for a lot of these customers we're leaning in with the RTX Pro; that Blackwell Server Edition is a great starting point.
Many of them see that as the first build-out of the factory while they decommission legacy systems and make more space, cooling and power available. Their plan is to bring in a core training system, like a B300 system, once they prove out this other capability incrementally; then those RTX 6000s become inferencing systems and the training moves to the B300. So it's a spectrum, right?
SPEAKER_00:And more than ever before. Yeah, and I've seen a pretty nice table from NVIDIA that has all those GPUs you listed and where they fit in terms of good use cases. We support that 100%, because in our AI Proving Ground we still have A100s in there that are being used. So A100s, H100s, all the way up to B300, we have them. And to your point, it's not about throwing those out, it's about repurposing them. Even if you're just using them for test/dev, they can still be useful if you have access to them. Yeah, for sure.
SPEAKER_01:Yeah, move your Jupyter notebooks and the experimental stuff there, exactly, the new innovation. Or look, it's all trading off time, right? If I need fast model iteration, I'm using the latest and greatest. If I've got something that's going to improve a long-term manufacturing process, maybe I can take a month to train it on an A100.
SPEAKER_03:Right?
SPEAKER_02:Yeah, I mean, A100, B300, H200, do you guys ever feel like you're playing GPU bingo here? Well, we're running short on time, so I do want to be respectful of the time the two of you have given us; it's definitely much appreciated. John, I'll start with you here. Maybe walk us through what you think the future of the AI factory is as things start to get more autonomous, more edge-based, pick your path. What do organizations need to be thinking about? Where do you see the AI factory going in the future?
SPEAKER_01:That's a great question. There are a couple of different veins we could take that down. I do think, as we see the rise of agentic AI, with smaller, expert AI systems that are purpose-built and work in concert with other agents, you're going to see a natural tendency toward more distributed AI implementations. I think you're still going to need your core factory to manufacture those things. But when they go out and live in the wild and operate more autonomously (I always think of agentic as describing giving AI agency, giving it the ability to act, whether that is to call on other AI or to make decisions), I think you're going to see almost a multi-tiered infrastructure to support that, where you've got your core and then your edge, whether that edge is geographically on the edge or just a construct in the same data center but in an edge network implementation. That's one thing you'll see. I think you'll see increasing interplay between those systems and more dynamic learning. There's some material I'm reading about now that I'm not even going to go into, around new theory on how AI learns from itself, reinforcement learning and all these other things that, directionally, are both exciting and terrifying at the same time. For enterprises, though, I would say be very mindful that in the process of deployment, or of technology selection and implementation, you have this balance between the proven or known good, an integrated full stack that you know is workable, and avoiding the lock-in that comes with being overly vendor-specific, because as we look to the future, how these things will be deployed, and where, is somewhat of an unknown. And if all of a sudden you're all in on one platform, it might be really challenging to grow beyond it. We saw this in the past with certain investments in cloud native, where all of a sudden you couldn't go beyond that particular cloud. So I think it will ultimately all be hybrid. I think it will be largely based on open source, but as it moves to production you're going to need a support framework for that openness. Flexibility, to me, is something that has to be maintained. So make sure the enterprise is weighing that. Look, there's always something to be said for going with your trusted partners, whether that's a partner like a WWT or an infrastructure provider like a Dell or an HP or others. Don't throw that away for something totally open, because there are relationships and support and a proven understanding of the enterprise that come with those relationships. Just make sure you're not getting overly locked in to a particular software stack or a given direction that could limit your ability to be flexible in how you deploy this going forward. Because as you look at some of what's coming, with legacy content delivery networks now becoming inferencing edge networks and deploying GPUs at the edge, if you're not compatible with that but you want to take advantage of it, are you going to re-architect all the way back to the core? Probably not. So one of NVIDIA's core principles is portability of code. That's the biggest thing: portability of code.
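One practical expression of that portability principle is writing application code against an OpenAI-compatible endpoint, which many on-prem serving stacks, clouds and neoclouds expose, so that only a base URL changes when the workload moves. The sketch below uses the openai Python SDK; the endpoint URLs and model name are placeholders, not real services.

```python
import os
from openai import OpenAI

# The same application code can target an on-prem cluster, a public cloud
# endpoint, or a neocloud, as long as each exposes an OpenAI-compatible API.
# URLs and model name below are illustrative placeholders.
ENDPOINTS = {
    "on_prem":  "http://ai-factory.internal.example:8000/v1",
    "cloud":    "https://llm.cloud-provider.example/v1",
    "neocloud": "https://api.neocloud-provider.example/v1",
}


def ask(question: str, target: str = "on_prem") -> str:
    """Send the same chat request to whichever tier currently hosts the model."""
    client = OpenAI(
        base_url=ENDPOINTS[target],
        api_key=os.environ.get("LLM_API_KEY", "not-needed-on-prem"),
    )
    response = client.chat.completions.create(
        model="example-chat-model",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(ask("Where should this workload run?", target="on_prem"))
```

Keeping the application's contract at this level is one way to avoid re-architecting back to the core when inference moves out to edge networks.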
SPEAKER_02:No, I love that. Derek, just to close out, kind of the same question, but with an understanding of what John just walked through as a representative of NVIDIA, a leader in the space: what are you seeing for the future? And, more importantly, how can organizations work their way into having that portability?
SPEAKER_00:Yeah. So I'm not going to touch the agentic or the physical; we've already started having those conversations around physical AI. But more specifically to the AI factory, we often include public cloud, and that hybrid story, in an AI factory. And what I see is that everything used to be a single cloud, and now it's multiple clouds for most enterprises. Obviously they have their on-premises environment, and a lot of that enterprise IP data sits on-prem. But what I also see, probably the newest addition, is that yes, they're going to have this multi-cloud, hybrid approach, but hybrid now also includes colos and neoclouds. Because, as an enterprise, you're going to gravitate toward wherever there's availability. If they're not going to get that availability in the public cloud for the GPUs they need, they're going to gravitate toward the neoclouds. And it's a good thing the neoclouds are there, because there's an extra pool of GPUs readily available that can be consumed via a mechanism similar to what enterprises are used to in the public cloud, so they can get the GPUs they need. So I think as we transition, thinking 12 to 24 months out, there's going to be this trio of on-prem, public cloud, and neocloud or GPU-as-a-service provider as part of an enterprise stack, or an enterprise go-to-market, in terms of how they support their customers, whether those customers are actual customers they're selling to or internal employees. Yeah.
SPEAKER_01:And I think part of that's going to come back to the very first analogy we used around the factory: which one of those has the right roads? If it's the railroad I need to bring the supplies in, my data, then I'm largely on-prem, or I know I've got enough of a factory investment that I'm going to be able to use it; the cost-effectiveness of building your own factory versus contract manufacturing is always going to be better. But if I've got an environment where I need distribution roads out to all the small towns I want to send my goods to, why wouldn't I use a distribution network like a public cloud that already has all those endpoints built, and put my actual application integration out there? Or if it's somewhere in between, where I want to maintain openness, I want to maintain control, but I don't want to build it, the neoclouds are great, because they are by their very nature open, they're going to allow portability, and they have that best of both worlds in terms of the as-a-service implementation with the ability to bring your full stack to them in many ways. And they're building out; they have the major highways, if not all the arteries that go out from there. So consider the mechanical, the physical, where the data is, with data locality becoming increasingly important, and then the overall distribution network you're connecting into. The roads out of the factory are now the last-mile endpoints, the things that have largely been what the internet has been about historically. Exciting times, though. And yes, we didn't touch on physical AI; I'm doing a lot of work in that space, and that's a fun one, but we'll save it for the next episode. You've been a pleasure to have on, so 100% we'll be giving you the invite back. John, thank you so much for your time out there in Oakland. No, my pleasure, thanks for having me. WWT's been a great partner of NVIDIA, and a great partner of mine for two decades before I ever even came here, so happy to do it, and I look forward to future conversations.
SPEAKER_02:Yeah, and Derek as well, thank you for joining. I know you're out west, but you're here in St. Louis today, so thanks for joining. Yeah, thanks Brian, thanks John. Derek, great to see you. Good to see you. Okay, what we heard today from both John and Derek is that the AI factory isn't a product. It's not a rack of GPUs or a single model or workflow tool. It's an operating model for how intelligence gets built, deployed and scaled inside an organization. And the key insight is this: the enterprises that succeed will be the ones that learn to operationalize AI as a shared, secure, flexible resource across the business, a factory with the right roads leading in and the right roads leading out. This episode of the AI Proving Ground Podcast was co-produced by Nas Baker, Kara Kuhn and Diane Devery. Our audio and video engineer is John Knoblock. My name is Brian Pelt. Thanks for listening, and we'll see you next time.