AI Proving Ground Podcast

Hidden Infrastructure Demands of Enterprise AI

World Wide Technology

As AI pushes the limits of traditional IT infrastructure, enterprises are racing to modernize their data centers. In this episode, Mike Parham and Bruce Gray walk us through the behind-the-scenes decisions that matter — from power and cooling challenges to GPU readiness and sustainability. Whether you're modernizing or starting from scratch, this conversation is your blueprint for AI-ready infrastructure.

Support for this episode provided by: Vertiv

More about this week's guests:

Bruce Gray, a results-driven IT executive at World Wide Technology since 2007, brings 25+ years of experience in Building Automation, IT, and Telecom. As Practice Manager, he leads business development and execution for data center design and IT facilities infrastructure. With a background in architecture, programming, and electrical engineering, Bruce excels in strategy, project management, and vendor relations—always seeking challenges that drive impact.

Bruce's top pick: Data Center Priorities for 2025

Mike Parham is a Technical Solutions Architect at World Wide Technology, where he has been delivering innovative IT solutions since 2011. With a strong technical background and a focus on aligning technology with business goals, Mike helps clients design and implement scalable, efficient architectures. He is known for his collaborative approach, deep expertise, and commitment to driving successful outcomes.

Mike's top pick: AI and Data Priorities for 2025

The AI Proving Ground Podcast leverages the deep AI technical and business expertise from within World Wide Technology's one-of-a-kind AI Proving Ground, which provides unrivaled access to the world's leading AI technologies. This unique lab environment accelerates your ability to learn about, test, train and implement AI solutions.

Learn more about WWT's AI Proving Ground.

The AI Proving Ground is a composable lab environment that features the latest high-performance infrastructure and reference architectures from the world's leading AI companies, such as NVIDIA, Cisco, Dell, F5, AMD, Intel and others.

Developed within our Advanced Technology Center (ATC), this one-of-a-kind lab environment empowers IT teams to evaluate and test AI infrastructure, software and solutions for efficacy, scalability and flexibility — all under one roof. The AI Proving Ground provides visibility into data flows across the entire development pipeline, enabling more informed decision-making while safeguarding production environments.

Speaker 1:

In the rush to build the future of AI, there's a crisis quietly unfolding in the background. Our infrastructure isn't ready: power grids are straining, cooling systems are maxed out, and enterprise data centers built for a different era are now being asked to support workloads they were never designed for. On today's show, I'll talk with Mike Parham and Bruce Gray, two experts on WWT's facilities infrastructure team who've been inside the rooms where those hard choices are made, from power density to GPU supply chains to the geopolitical cost of inaction. They'll pull from personal experience and take us deep into the real-world challenges of building infrastructure for an AI-first future. This is the AI Proving Ground podcast from World Wide Technology, everything AI, all in one place. And today's episode isn't about hypotheticals. It's about whether your organization is ready or already behind. So let's get to it. Okay, Mike, Bruce, thanks so much for joining us on the AI Proving Ground podcast today.

Speaker 3:

How are you doing? Good, doing well, happy to be here. Doing great, thanks for having us.

Speaker 1:

Hey, Bruce, I've got to ask you: when you hear that a single AI cabinet can now draw more power than a single US home, what are the infrastructure dominoes that fall just from that simple statement?

Speaker 3:

Yeah, well, we work on six disciplines, and that triggers four of them right away: power, cooling, space and cabling. So Mike and I get pretty excited about that.

Speaker 1:

Mike, any reaction to that? I mean, what are the implications? I'm going back a couple of months to NVIDIA GTC, where NVIDIA announced new racks, Rubin and Rubin Ultra, that can go up to 600 kilowatts of power. I mean, are you sitting there on the sidelines with your jaw on the ground?

Speaker 2:

Yeah, we've watched this evolution both here at Worldwide and before we joined Worldwide. When I first got in the industry over 20 years ago, racks were probably in that 2 to 4 kW range, right, 2 to 4,000 watts. We could do that with single-phase, traditional 120 or 208 volt power. It slowly started to migrate to three-phase, and up until recently, 20 kW and below was considered high density. Now some of these rack loads are 30 to 50 kW on the low side, over a hundred kW on the high side, and, like you just mentioned, 600 kW and beyond is coming.
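
To make that jump concrete, here's a rough sketch (not from the episode; the voltages, amperages and the 80% continuous-load derate are illustrative assumptions) of how usable rack power scales as the feed moves from single-phase to three-phase and to higher voltages:

```python
# Rough sketch (illustrative only): how rack power scales with the electrical feed.
# Real designs derate breakers for continuous load (80% in the US) and vary by site.
import math

def rack_kw(volts, amps, phases=1, derate=0.8):
    """Approximate usable rack power in kW for a given circuit."""
    if phases == 3:
        watts = math.sqrt(3) * volts * amps * derate  # volts = line-to-line
    else:
        watts = volts * amps * derate
    return watts / 1000

print(rack_kw(120, 20))            # ~1.9 kW  - legacy single-phase circuit
print(rack_kw(208, 30))            # ~5.0 kW  - denser single-phase
print(rack_kw(208, 60, phases=3))  # ~17.3 kW - three-phase "high density" of a few years ago
print(rack_kw(415, 60, phases=3))  # ~34.5 kW - the 30-50 kW AI range needs feeds like this
```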

Speaker 2:

So it's crazy how fast things have changed, because it took 20 years to get to that 20 kW mark, and in a matter of a year, year and a half, it's well, well, well surpassed that. So it's exciting, because our area, I think, has been slow to change. Right, data centers are built for 10, 15 years or longer, but that's not the case anymore with what we're seeing from these new NVIDIA chips and other AI loads.

Speaker 1:

Yeah, and Bruce, talk to me a little bit about the implications that has on the data center. Obviously, 600 might be a little bit off on the horizon, but it's coming. What do the 15 to 20, even the 60 kW racks mean? What are the implications this is having on the standard enterprise data center?

Speaker 3:

Yeah, well, it's nice. IT always talks about disruption; well, we're finally the disruptors, right. So it's going to disrupt the way you bring power to your data center.

Speaker 3:

It's going to disrupt the way you code, obviously, and it's definitely disrupting things that people don't think about, like the space and the cabling. The cabinets are getting bigger, the square footage needs to increase, the aisle distance from cabinet to cabinet has to increase. So it's total disruption in everybody's space, even at the lower end.

Speaker 1:

So how many enterprise data centers out there today can handle these AI workloads, if any, and how many are on a path to being able to support those AI strategies?

Speaker 2:

Yeah, you know, I would say very few, right? So we see the traditional enterprise data centers, or a customer's on-prem data center site, and then you see the hyperscalers, right? They're building these facilities to be able to handle those loads. But it's going to be interesting to see how the traditional enterprise data center can support something like this, because, again, when Bruce and I go out there, it's sometimes difficult to do 20, maybe 25 kW. Well, that might get you one AI box, right, or two. So I think it's going to be really challenging to see data centers that can accommodate that on-prem load.

Speaker 2:

So what we're really working with customers to do is, before you even start talking about the technology that's going to sit in the rack, stop. Let's look at the facility and what it can and cannot do. So we have assessment services to come out on site to understand, and Bruce just mentioned it, the power coming into the building. Is there even enough power coming to the building? Do we need to go talk to the utility company? What inside of that power chain needs to possibly get upgraded? Right, so you've got power coming into the building, it needs to make it all the way down to the data center, and how many devices along the way do we need to possibly change? So that's where a real thorough assessment is going to look at that, to understand what's the implication from the IT side, from the AI that's going to be on-prem. What can we do in the current data center? Maybe optimization and adding some newer technologies, but ultimately we might have to increase the power coming to the building.
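
As a rough illustration of why the utility feed becomes the first question, here's a back-of-the-envelope sketch (not WWT's assessment methodology; the PUE, growth margin and rack count are assumptions) of the facility power a small AI deployment implies:

```python
# Back-of-the-envelope sketch (illustrative assumptions throughout): estimate the
# utility capacity an AI deployment implies, including cooling/distribution overhead
# via PUE (Power Usage Effectiveness) and a margin for growth or failover.

def facility_power_kw(racks, kw_per_rack, pue=1.4, growth_margin=0.1):
    it_load = racks * kw_per_rack        # IT load measured at the racks
    total = it_load * pue                # add cooling and distribution losses
    return total * (1 + growth_margin)   # headroom for growth/redundancy

# Four ~130 kW AI racks work out to roughly 0.8 MW of utility capacity.
print(f"{facility_power_kw(4, 130):.0f} kW")
```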

Speaker 1:

Yeah, Bruce, and those don't sound like easy changes to make if you have to bring in more power or add cooling or raise a floor or do whatever it might be to make sure that you're able to support that AI strategy. Are these changes a long time coming? Do they take time to make? And certainly there's probably a lot of cost associated with them as well.

Speaker 3:

Yeah, you've said it all right there. It is a long time coming. One of the things Mike and I have been trying to educate on is that you need to plan for your facilities now, at the same time that you're planning for your AI. I've heard recently that some of the new chips are about a 36-week delivery. Well, generators, switchgear, CRAC units, any of the cooling, that is longer than that in some cases. So if you wait too long to design and to do this facilities uplift, you're going to miss the boat on the timelines.

Speaker 3:

And you mentioned cost; there's a cost impact, obviously, right. And you asked the question earlier, and Mike answered it, so let me go back to that for a second: how many customers are ready? I mean, none, right? Because if we're changing the way we apply the power, that's a whole upgrade. So when they're looking at this, they're defining just a pocket of where they need AI, and not the entire data hall, right? So we talk about legacy operating systems and how we migrate and make them efficient. That's our best asset: to go in and prepare the customer with their existing legacy so that they can move into the AI area that they need in a slower fashion.

Speaker 1:

And are we talking only on-prem here, or are there also facilities implications when we're talking about colo or hybrid or any other way you would run an AI workload? Or are we strictly talking just on-prem deployment?

Speaker 2:

Yeah, typically Bruce and I are getting involved for the on-prem, right. There are other teams here at Worldwide to kind of help with the GPU-as-a-service question: where am I going to put that load first if it's not going to be in my data center? And I think that's going to be the case in most customers' step approach, where it's going to start off-prem first. Then, when it does come on site, what does that rack load look like? Is it a little bit different than what they were doing, training at a big GPU-as-a-service data center? When they bring that inference load on-prem, there are still probably going to be major changes required.

Speaker 2:

But we do have customers, whether it's in the education or manufacturing space, that are doing some of that training, some of that high-density load, on-prem today. And, like we've all been talking about, there are no data centers out there, traditional data centers anyway, that are ready for these loads. So Bruce mentioned the power coming in. Not only do we have to possibly upgrade the power, but just the type of power that we're taking to these racks is different. Right, in the US we're traditionally 480 volt coming in and it's typically getting stepped down, but now possibly we're sending that 480 volt directly to the rack and using 277, or we're doing some DC busway in the back of the rack. So there are just a lot of changes other than I just need more power and cooling.
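
For readers wondering where 277 comes from: it's simply the line-to-neutral voltage of a 480 V three-phase feed. A minimal check (generic math, not tied to any particular vendor's gear):

```python
# Quick check: 277 V is the line-to-neutral voltage of a 480 V three-phase system,
# which is why "480 to the rack" and "277 V" show up together.
import math

line_to_line = 480
line_to_neutral = line_to_line / math.sqrt(3)
print(round(line_to_neutral))  # ~277 V
```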

Speaker 3:

Well, I was going to say, Mike's talking about the colos, and you had asked the question of who's prepared, or where they're trying to prepare, because that's where many people are going to have to go. But if you think about it, almost all colos are at a Tier II level, and this is requiring a much higher tier level. So now they all have to retrofit and move not only to new power, new cooling and new layouts, but they also have to move up the tier levels, which means redundancy, for safety and battery.

Speaker 1:

Yeah, well, we've talked a lot about power thus far, but there are three other areas that we have in the research that we've done here at WWT: cabling, cooling and certainly space and physical area. Mike, can you dive through and give us a little bit of a brief on why those things are important to consider, and maybe a little bit about what a lot of us are not thinking about when it comes to those areas in AI?

Speaker 2:

Sure, yeah. I'll cover maybe the cooling and the space and have Bruce help with the cabling. From a cooling perspective, if we just kind of rewind a little bit and look at a traditional data center, the way they've been designed for a long time is with what's called a raised-floor approach. That's where I have the air conditioners that sit up against the wall. I'm pressurizing the raised floor, putting my cold air down there, and then that cold air is coming up from the raised floor in front of the racks. That's my cold aisle. That approach is a solid approach. It's been very reliable for a long time and we can optimize it. We can get fairly dense with that approach, and there are some newer technologies out there, whether that's fan wall or others. But probably the rule of thumb is let's stop around 15 to 20 kW with that approach.

Speaker 2:

If we need to go higher with air cooling, there are newer technologies. There's what's called in-row or close-coupled air conditioning, where instead of having those AC units sit 30, 40 feet away and hoping that my hot air migrates over there, I can put those air conditioners right next to my IT cabinets, so they're grabbing all that hot air as soon as it's produced. They're providing that cool air on the front side of the rack, and we can go denser than that, right? So that'll probably get you maybe in that 30, 40 kW per rack. We can start doing rear door, which is the same concept as that row-based approach, but now I just turn it sideways and put it on the back of the rack, so as all that heat generated by the GPUs and CPUs is expelled from the rack, it's kind of trapped and it has to go through that rear door. So we're going to neutralize it, and with that approach we can get up to maybe 50 to 80 kW and perhaps beyond.

Speaker 2:

The reason we want to start with air, and I'll hit liquid in a second, is that we need to have that environment well-tuned and optimized, because air is not going away. Even when we start doing liquid cooling, oftentimes that's only going to cool maybe 80% of the load. So if we have a load right now, some of the racks out there are 132 kW, and I have to cool 15 to 20% of that with air in just a traditional data center with a raised floor, I'm still perhaps going to struggle, right? So that's why we like to come in and do a CFD analysis to truly understand what that facility can do.
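
A quick sketch of the "air isn't going away" math Mike describes, using the standard sensible-heat rule of thumb (the 80% liquid-capture fraction and the 20°F delta-T are illustrative assumptions):

```python
# Sketch of the residual air-cooling load when direct liquid cooling captures ~80%
# of a rack's heat. Uses the common rule of thumb CFM ≈ BTU/hr / (1.08 × ΔT°F).

def residual_air_load_kw(rack_kw, liquid_fraction=0.8):
    """Heat the room's air system must still remove."""
    return rack_kw * (1 - liquid_fraction)

def required_cfm(kw, delta_t_f=20):
    """Airflow needed to carry that heat at a given supply/return delta-T."""
    btu_per_hr = kw * 3412
    return btu_per_hr / (1.08 * delta_t_f)

air_kw = residual_air_load_kw(132)         # ~26 kW still handled by air
print(round(air_kw), "kW to air")
print(round(required_cfm(air_kw)), "CFM")  # ~4,170 CFM for that one rack
```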

Speaker 2:

If we do start to do liquid cooling, there's a bunch of different options and a bunch of different ways to do it, but one of the common ones that we're seeing customers approach is called direct liquid to the chip. So instead of immersion, instead of putting those devices in some type of fluid, I'm going to take fluid directly into the server chassis, directly to the GPU and CPU, and that fluid is going to remove my heat. Now I can do it positive pressure or negative pressure, and that fluid could be what's called single phase or two phase. So again, there are several options out there, but that's how we're really going to get to those rack densities.

Speaker 2:

And I know a lot of times when you talk about liquid to the chip, or liquid inside of the rack, it's a little scary, you know. The truth is, there's always been water in the data center. Even when I have air conditioners that are up against the wall and 30 feet away from my IT equipment, those air conditioners have to do rehumidification, so I'm always taking water to them. It's all about how you manage the risk. In-row typically was always chilled water; rear door can be chilled water. So again, there's always water there. But now we're introducing it directly into the server, and we're doing that to really capture the heat right as it's created. Those devices are getting so hot that just blowing air across them simply doesn't work anymore, right? So that's kind of the cooling.

Speaker 2:

And you know Bruce had mentioned on the space, our aisles are getting bigger.

Speaker 2:

So traditionally, with what was called your hot aisle and cold aisle, right, cold aisle in the front of the rack, hot aisle in the rear of the rack, I would say you want a minimum of four feet on both of those.

Speaker 2:

That gives you enough space to open the doors, put equipment in there and comfortably work, and there are ADA requirements as well that you have to consider. But now we're putting a rear door on an IT cabinet, and, oh, by the way, that IT cabinet is probably getting wider and deeper. A traditional rack was maybe 42 inches deep; now they're probably at least 48 inches deep and perhaps going deeper. And behind that rack, if I put a rear door on it, that's probably going to add a foot, maybe a little bit more, a foot and a half perhaps. And if I have two rows back to back, both of those just got a rear door added to accommodate that 20% airflow that I still need to cool. Well, I've just chewed up three feet of my hot aisle. So that's what Bruce said earlier: some things that we might not think about, like my aisles having to get larger to accommodate the different types of cooling that I perhaps have to do in the data center.

Speaker 1:

Yeah, Bruce, take us to that second half. Talk about cabling and the space considerations.

Speaker 3:

Yeah, and let me jump on the space point with Mike there. You know, he said it: the cabinets are 42 inches deep and the new hardware is 41. You can't close the door once you connect it, right? So they have to go to deeper cabinets. So back to that disruption. Now you have to change your cabinets; you can't reuse those existing cabinets. And so that brings us into the cabling.

Speaker 3:

So what about the cabling? Well, it's denser. There's a lot more of it required, and what's existing is probably not capable of doing the speed, the bandwidth and the latency that people are looking for, right? Most people probably don't have more than what they started with, 10 and 40 gig. They might have 100 gig; some maybe bigger customers will go 400. Well, we need 800 gig speeds on this network, right, and we need to use OM4 and OM5 multimode fiber so that we can pass more light and we can pass it quicker. Those kinds of things are very important.

Speaker 3:

You think about the massive amounts of data that are going to be passed from GPU to GPU or GPU to storage, right? And, you know, Mike and I don't touch on this very much because we're not the IT guys, but what are the latency requirements there? Can we spread these out to maximize power and cooling? In most cases, because of what they're trying to do, we can't. So the latency is very important, and these workloads need to stay close together. Then there's the amount of cable that people don't consider. Go back to the colo conversation: you buy a space and they already have the cable trays. Well, those cable trays are not wide and deep enough to handle the 8,000 cables you need per cabinet now, right? They're made for a couple hundred cables. And so there's the weight load, the density, the code requirements; you have to follow ANSI requirements of 40% fill ratios. And that's a problem that customers are not thinking about.
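
As an illustration of the fill-ratio point, here's a minimal tray-fill check (the cable diameter, tray size and cable count are assumed values for the sketch, not figures from the episode):

```python
# Sketch of a cable-tray fill check against a ~40% fill guideline.
# Dimensions below are illustrative assumptions.
import math

def tray_fill_pct(num_cables, cable_dia_in, tray_width_in, tray_depth_in):
    cable_area = num_cables * math.pi * (cable_dia_in / 2) ** 2  # total cable cross-section
    tray_area = tray_width_in * tray_depth_in                    # usable tray cross-section
    return 100 * cable_area / tray_area

# 800 slim fiber jumpers at ~0.12" OD in a 12" x 4" tray:
fill = tray_fill_pct(800, 0.12, 12, 4)
print(f"{fill:.1f}% fill")  # ~18.8%; thicker copper cables fill the same tray far faster
```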

Speaker 1:

Okay. Well, Bruce, you talk about going from a couple hundred cables to 8,000 cables, so that's adding space. You talked about the depth of these racks going from, you know, 41, 42 inches to 48. What else is taking up space, and why is space such a concern? I guess it's just the fact that you have a certain amount of capacity within a data center. What else do people need to be asking themselves as it relates to space constraints?

Speaker 3:

Yes, yeah. Well, Mike's talking about the advanced cooling, the liquid to the chip and all that. Think about that: if you're using an existing cabinet, which he said was two feet by 42 inches, and you're trying to get that kind of power into it, you need more PDUs, right? Each PDU can only handle a maximum load, so to get to those density levels you've got to keep increasing the number of PDUs. So where do you put all these PDUs in a two-foot by four-foot cabinet, right?

Speaker 3:

Then you talk about the liquid cooling. Where do you add the manifold that controls that fluid? How does that fit into this cabinet? And then, back to the 8,000 cables: how do you manage that? How do you get all of those into that cabinet as well? So all of this becomes a problem. And if you don't manage the cables correctly, what you're going to have is an airflow problem, which gives you a heat problem, and it's going to give you a maintenance problem, because you can't work on the gear if the cables aren't managed correctly. And on top of all that, what people really forget about is the labeling. Every cable needs a label. Well, think about adding that mess to the world if you don't plan it.
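
A small sketch of the PDU arithmetic Bruce is describing (the PDU ratings and the A/B redundancy assumption are illustrative; real selections depend on voltage, phases and breaker sizing):

```python
# Sketch of why dense racks sprout extra PDUs. Ratings below are assumptions.
import math

def pdus_needed(rack_kw, pdu_kw, redundant=True):
    count = math.ceil(rack_kw / pdu_kw)       # enough capacity for the IT load
    return count * 2 if redundant else count  # A/B feeds double the count

print(pdus_needed(17, 17.3))   # 2 - one per feed covered a legacy rack
print(pdus_needed(60, 17.3))   # 8 - a 60 kW rack with the same PDU model
print(pdus_needed(132, 34.6))  # 8 - even bigger PDUs still add up at 132 kW
```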

Speaker 2:

A cable nightmare, a label nightmare. This episode is supported by Vertiv. Vertiv offers critical infrastructure solutions to ensure data center reliability and efficiency.

Speaker 1:

Keep your operations running smoothly with Vertiv's robust infrastructure products. Mike, take an average customer of ours. Are these questions that they're already asking themselves? Or are these kind of aha moments along their process where they're like oh, I didn't account for that, I didn't account for that, and now it's becoming more and more of a bottleneck and delaying AI strategy?

Speaker 2:

Yeah, I think a little bit of both, but probably more of a surprise to most, right, where they didn't realize how dense these racks truly are from a power perspective and what they're going to have to change. Again, if we're backing up outside of the data center: do we have to change the power coming into the data center, meaning I need more and more capacity? Once I'm in the data center, and we kind of reviewed different technologies, right, I'm certainly going to have to change the type of power, the redundancy level. Do I need a third UPS in some cases? Do I need to put in some overhead busway? Do I need to take different voltages to the rack? So there are a lot of changes that I think customers aren't expecting, because, again, it goes back to something I said when we first started discussing this.

Speaker 2:

You know data centers typically had a life of 10, 15 years or longer, right?

Speaker 2:

So that traditional space of being able to handle 20, 25 kW per rack, 208 volt to that rack, that's worked for a long time, and it's not working anymore.

Speaker 2:

So I think it's caught some certainly by surprise, and I think having that conversation at the very beginning is really, really important. If we're going to talk about any type of HPC or AI on-prem in a traditional data center, let's stop and do that assessment first; understand what the space can accommodate. Because, you know, I think Bruce brought up lead times on devices. There's also the lead time on: do I call the power company and get more power to the building? What kind of lead time is that, right? So not only lead times on maybe new generators and ATSs and switchgear and all that good stuff, but just getting the utility company to bring me more power. And we've had conversations with customers about, and Bruce does a great job covering this in our AI day presentations, managing behind the meter. Do I have to bring my own power? Do I have to put power on-prem, meaning some type of microgrid, because the utility just can't get any power fast enough?

Speaker 2:

Yep, go ahead, Bruce.

Speaker 3:

Let me add to what Mike said. Mike and I do the AI day presentation along with all the others, and ours is the scary presentation, right: is your data center the Achilles heel? What we do is we scare them. We tell them it's going to impact everything we've talked about so far: how it's going to impact power, how it's going to impact the cooling, the cabling and the space. And the feedback we're receiving is exactly that. Those customers are not planning and thinking about that, especially the designing-these-in-parallel aspect. Many customers have come up to Mike and me afterward, and we've had a lot of follow-on workshops now talking about the preparation for AI and the AI readiness assessment that Mike's referenced a couple of times. So yeah, customers aren't really preparing, and they're being surprised by some of these concerns.

Speaker 1:

Yeah. Well, recognizing that this could be a bottleneck, and that it could lead to hidden costs, which nobody likes: how would those determining AI strategy message to the board or to executive leaders that this facilities conversation is one that we need to have upfront, early and often? Should they just be saying, hey, this could significantly set back our strategy if we don't account for it? Or have you found any success with clients that we work with in terms of messaging that?

Speaker 2:

Yeah, it's certainly going to need to happen, right? The messaging does need to come. And where does it come from?

Speaker 2:

Because a lot of times you still see different types of struggles inside of the customer, where you have the facilities camp and you have the IT camp, and sometimes one doesn't know what the other one is doing, or doesn't understand the technologies on the other side. So if I'm strictly on the IT side and I don't understand that my traditional data center can't cool the 132 kilowatt rack I'm getting ready to slide in, I may not even bring that up. And then on the facilities side, if I'm just relying on the way we've done data centers for 20, 30 years, not knowing that what's coming from the IT space is now 20 times the amount of power, that's a big disconnect. So I think the answer is education, right, education, and making sure to have these conversations very, very early. And that's where we can do workshops, we can do that education track, to make sure that both sides, both parties inside of that customer, truly understand how each is going to impact the other.

Speaker 1:

Yeah, and Bruce, these aren't lessons that we're just, you know, making up here. We've actually learned some of these lessons ourselves. We have created our own AI Proving Ground, which, of course, is what our podcast here is named after. Walk me through the journey that we had with the AI Proving Ground and perhaps some of the lessons that we learned, either the easy way or the hard way, and how that relates to some of our clients' journeys as it relates to their AI strategy.

Speaker 3:

Yeah, and WWT made a commitment to build the AI Proving Ground in the one data hall that we have in the ATC, and when they went to do this, the amount of power was not available, just like Mike explained earlier. So we're a customer and we want more power. We go back to the utility and ask for more power, which is a simple solution until you find out the utility can't deliver power for 18 to 24 months. Right, how do I get my technology rollout without that? So what we had to do is shift workloads. Right, they had to move things around. We touched on briefly what we're doing with our legacy power and cooling models and how we make them more efficient, and that's just what our team had to do over there. Mike and I aren't part of that team; the ATC team has their own group of people that make this happen. But that's exactly what they had to do. They had to shift workloads, they had to consolidate and optimize to be able to bring in the additional power they needed to build out the AI program.

Speaker 2:

Yep. And something to bring up there, and Bruce mentions this a lot when we do these AI roadshows, is: what about building new? Is it going to be quicker and easier and better? Perhaps, if I just, not necessarily abandon my data center, maybe I use that for the low-density on-prem loads that I have, but I'm going to start from scratch. And we've seen a real kind of boom in that space with modular prefab data centers, where a customer can have a data center stood up maybe quicker and easier than constructing a whole new building. So there are partners out there that we work with, and I think it's a real value that Worldwide adds. We can certainly architect it and bring in the best-of-breed technology from the IT side of the house, but then we can do that same thing with the facility side and get these modular prefab data centers completely integrated and turnkey for the customer, so now they can stand up this HPC AI environment a little bit quicker, as opposed to waiting for a new data center to be built.

Speaker 1:

So is that taking kind of a greenfield approach there, then, with that modular approach?

Speaker 3:

Absolutely, yep, you bet. If you do an AI readiness assessment, the customers have to decide what their goals are. Is it speed? Is it money? What are the roadblocks for them? Retrofitting a data center definitely takes longer than building new. It's not cheaper; usually building new is more expensive, but it gets you to your end goal quicker. So if speed of deployment is what gets you the results you're looking for, then Mike's spot on there with modular and new builds.

Speaker 1:

Well, whether you're retrofitting or going the greenfield route: if data center life cycles were running that 10 to 15 years, Mike, that you'd mentioned, well, the AI hardware landscape feels like it's cycling out every couple of weeks, if not shorter. So how can you maintain flexibility in that environment, knowing that things are always going to be changing on the horizon? Or is it just a game of catch-up for the time being?

Speaker 2:

Yeah, great question. Because six or seven months ago there were questions about how we handle these 130 kW or 200 kW racks, and the technology in some cases wasn't even available yet, right? So if we're going to now plan a data center to handle these AI loads, maybe 50, 60 kW on the low side, a little over 100 on the high side, but knowing that 600 kW is coming, that's a real challenge, because maybe there's not even a solution out there yet. And I think there is, right; there were OEMs showing some technology that handles 600 kW per rack. But how common is it going to be for customers to have that 600 kW rack on-prem at their site? Is something like that maybe just going to live at these hyperscale kind of colo environments, or is that going to be on-prem in a traditional data center?

Speaker 2:

So I don't know the answer to some of that, and I don't know if our customers understand that yet either. But yeah, to kind of answer your question, that 10- and 15-year lifecycle has been the way for decades, right? But now AI is changing things, and the requirements put on the facility are changing so dramatically. It's not just that it's happening in a quick time frame; it's that the capacities are a drastic change, from a traditional 20 or 30 kW, now we're talking 130 kW, up to 600 kW. That, I think, is a real challenge to plan for if life expectancy in my data center is supposed to go 5, 10, 15 years.

Speaker 1:

Yeah, Bruce, anything to add from your end in terms of trying to strive towards that flexibility, knowing that things are changing all the time in the business landscape?

Speaker 3:

Well, one of the interesting things is that chipsets and GPUs are moving faster than the facilities, right? In every industry out there, there's a regulatory commission or a governing body that sets standards and parameters and laws and regulations, and that's not around yet. That has not happened. ASHRAE and others have not stepped up and said, well, this is how we're going to do liquid cooling, this is how we're going to cool 600 kW. So the OEMs that are out there, they're still trying to decide what's next. So it is going to be a stumbling road for a little while until we get more standards, let's just say. What kind of liquid are we going to use? What kind of connectors are we going to use? What kind of other technology that's not even developed yet are we going to be able to use? So it's an interesting time right now.

Speaker 2:

Yeah, and I think also, what might help customers is, I don't know if we're anticipating the entire data center. Again, I'm not talking the hyperscale GPU-as-a-service kind of environment; I'm talking traditional on-prem. I don't think that's going to be the entire data center, right? So maybe the silver lining, the good news, is my entire data center doesn't need to support these drastically increased loads. Maybe it's a few rows, a couple rows, one row perhaps, or just a few racks. So it's going to be interesting to see how that evolves, how quickly customers bring this technology into their data center, and how much of it.

Speaker 1:

I did want to dive deeper into the liquid cooling aspect. You know, several months or a year or so ago that felt pretty exotic, I think, in most cases, but it's certainly becoming more and more common, and we're seeing a lot more vendors hit the landscape. So I'm just wondering if you can break down exactly what we're seeing in the liquid cooling space, and is there any more innovation coming down that line that'll maybe provide a breakthrough for handling some of these workloads?

Speaker 2:

Yeah, so I think the good news is there were a lot of options out there, and I think we're going to start to see some consolidation. So some of the manufacturers that we work with have already started that process of acquiring these technologies, and I think it was a little bit of a, hey, let's wait and see what's going to win the battle. Do I go liquid to the chip? Do I do immersion? Do I do single phase or two phase? Do I do positive pressure or negative pressure? So, to your point, a lot of this was exotic. It was really something that most customers weren't using, right? Even when Bruce and I would talk about more advanced air technologies, that's still in some cases not common and a little exotic. When we talk about hot aisle containment and rear door heat exchangers, right, that's not common in most cases. So if we're introducing liquid into the mix right at the IT rack, or inside the IT rack, that's certainly a little bit different. So I think the good news is, again, we're going to see some consolidation of some of these partners, some of the options, and I think we're going to see some standards set on the other side of the fence, meaning the IT folks. The GPU and CPU makers are probably going to start to choose their favorites. What we've started to see most designs settle on is direct liquid to the GPU and CPU with positive pressure. So that, I think, is what we're going to see the majority of, maybe for the next year or so.

Speaker 2:

The advantage, maybe, if we're looking at different types of liquid cooling: with that immersion concept, where I'm putting the entire device into a tub of liquid, I'm capturing all that heat, right? So we talked about it earlier: the reason why we want to maintain a well-optimized and tuned air environment is that I still need to cool perhaps up to 20% of my heat load that's not getting cooled today by direct liquid to the chip or GPU. If I do immersion, I'm capturing all that heat. It's a little more exotic, right; I've got to have additional devices in there, and my floor is now laid out a little bit differently, because the rack is not standing up, it's laid down on its side. So some customers are using that, you know, like the Bitcoin kind of mining technology; we're seeing customers like that use it. I don't know if that's going to be very popular in the traditional enterprise data center space. Again, I think we're going to see more of that kind of DLC environment.

Speaker 1:

Yeah, Bruce, any innovations taking place, whether it's in the liquid cooling or the power, the cabling, the space, that give you hope that we'll be on the right path here to tackle some of this stuff?

Speaker 3:

The thing is, converting to DC is a possibility now with this OCP rack and having this bus bar. I was just at a meeting this week where that was a hot topic: how they're going to convert the rest of the technology that's out there to run DC, which is not in a traditional data center. I mean, that's used in the telco space. So that's one that I think is gaining a lot more traction right now.

Speaker 1:

Let's say that a customer or an organization is beyond the power, the cooling, the space and the concerns that we've brought up so far today. What are the challenges that lie beyond that? Is it a talent question? Is it a sustainability thing? What are the other concerns after the fact?

Speaker 3:

There's a huge talent shortage, right? I mean, I think in my presentation I have one fact, stated by the governing bodies, that there's over 400,000 technician positions open in our area. I'm not talking about IT technicians; I'm talking about facilities technicians. And I can relate that back to cabling: people don't value cabling as highly as they should, and if you don't get a quality technician on the cable, you may be putting something in that doesn't get you what you need. You need those quality technicians to come out and test and tune so that you get the bandwidth you think you have, and you get that latency reduction and so forth. So there's definitely a technician shortage.

Speaker 1:

Yeah, Mike, what about you? What do you think lies beyond those four main concerns that we've talked about?

Speaker 2:

Yeah, you know, maybe something else that we haven't talked about is: are there going to be any new regulations set that are going to impact customers? So if I do decide to go greenfield, right, I've decided that I just cannot optimize what I have now, I need to start from scratch, do I have to comply with any new regulations for data center builds? Is that going to become more of a topic, where more municipalities and states are going to require you to put in some type of on-prem generation of your own power, whether that's fuel cell or wind? You know, that could be something that's certainly going to be a challenge and, again, add to the complexity of designing this environment, and perhaps it even becomes law.

Speaker 3:

It's a great one there, because we know that there are bills out in many of the states right now that are requiring exactly that. They're saying that when you permit a new data center, you're going to have to create X behind the meter, and that's where the innovation is going to come. Everybody knows about wind and solar, but what about nuclear or hydrogen, or fusion, geothermal, even wave generation? Where are these data centers located, and what kind of power is going to be required by each government agency? So that's a great point, Mike.

Speaker 1:

Well, moving forward, I'll give you another headline here. This just happened a couple of weeks ago. Jensen Huang, NVIDIA's CEO, as good of a North Star as we have these days in terms of understanding where the AI landscape will go, says that every company will need an AI factory moving forward. So how do you see that fitting into the future of data centers? Is it just keep considering everything we've talked about, or is that going to push the envelope even further?

Speaker 2:

You know, I think it's certainly going to have to account for everything we just talked about. And again, that AI factory: where is it going to live? Is it going to live on-prem, or in an as-a-service type environment? Because that's perhaps the easy button for a customer to get that AI factory: do it off-prem, where I don't have to worry about all those challenges and constraints of my own data center. If they are going to bring something like that on site, then absolutely we need to have these conversations, because there are going to be big impacts, and we need to understand what they can and can't accommodate.

Speaker 3:

Mike and I surely aren't going to say Jensen's not right. But I think you hit it, Mike: it's going to be shirt sizes, right? It's an AI factory. I'm Mr. Small Customer, I need a small AI factory because I want to do X, Y and Z; or I'm a large consumer and I need a bigger AI factory.

Speaker 1:

That, I think, will happen. We'll wrap up in a minute, but are there any questions that I didn't ask here today that you would ask of one another, in terms of helping our listeners' organizations out there understand how to advance their AI workloads? What should they be asking, or what questions would you ask of yourselves that would really advance this conversation?

Speaker 3:

I was going to say, Mike and I are the "how" people. You come to us and we figure out how to do it, right? So the questions will be: why are you doing it, what are you doing, and where are you doing it? What are your goals? We've got to figure out exactly what you're trying to accomplish, right: what are the goals, what's the challenge, what's your business case, and then what are you doing it with?

Speaker 3:

Mike's already said it a few times: is it a whole data center for AI as a service, or is it just, I need faster Google searches and I'm going to need a small AI factory? And then, where do you plan on doing it? Is it on-site or somewhere else? Once we know the why, what and where, Mike and I get to the how. So that's always where we want to start. And Mike said it a couple of times earlier: the readiness assessment. Mr. Customer, these are very important things today. We need to baseline, we need to find out what you have, and then we can apply the why, what and where to that. So that's what I would say.

Speaker 2:

Yep, and that's what I was going to bring up too: the assessment, for sure, and making sure that the customer's team is fully communicating on both sides of the house. Facilities and IT need to come together to have these conversations and understand how each is going to be impacted by some of these new technologies.

Speaker 1:

Cool. Well, Mike, Bruce, thank you so much for joining us here on the show today. It was very insightful, and hopefully helpful to those out there who may not be asking these types of questions within their own organizations. So thank you again, thanks for all the work you do for WWT, and hopefully we'll have you on again soon. Sounds good, thank you, appreciate it. Okay, great conversation.

Speaker 1:

So let's talk about some of the things we learned. First, AI isn't just a software challenge; it's a physical one too. From power and cooling to real estate and network architecture, scaling AI means rethinking the entire foundation of your data infrastructure. Second, the organizations moving fastest aren't just adding GPUs. They're planning holistically, anticipating density spikes, supply constraints and sustainability requirements before they become business blockers. And third, every enterprise has a choice to make: modernize with intent or fall behind by default. Because in an AI era, infrastructure is strategy. The question is no longer whether you need to modernize; it's whether your infrastructure can keep up with your ambition. This episode of the AI Proving Ground podcast was co-produced by Naz Baker, Cara Kuhn, Mallory Schaffran and Stephanie Hammond. Our audio and video engineer is John Knobloch. My name is Brian Felt. We'll see you next time.

Podcasts we love

Check out these other fine podcasts recommended by us, not an algorithm.

WWT Research & Insights - World Wide Technology

WWT Partner Spotlight - World Wide Technology

WWT Experts - World Wide Technology

Meet the Chief - World Wide Technology