The AI-First Business Podcast
🎙️ The AI-First Business Podcast 🤖 We take you behind the scenes with the leaders and teams writing the playbook on transitioning to an AI-First world
AI Unleashed: Wes Cummins on Supercomputing-as-a-Service and the Future of AI Infrastructure
Join us to dive into the world of AI infrastructure with Wes Cummins, CEO and Chairman of Applied Digital. Before founding the company, he founded and led 272 Capital LP and served as President of B. Riley Asset Management.
Applied Digital (Nasdaq: APLD) designs, develops and operates next-generation datacenters across North America to provide digital infrastructure solutions to the rapidly growing high performance computing (HPC) industry.
Discover the distinctions between traditional and high-power density data centers, and learn how competition for electricity is shaping the industry.
Explore the diverse customer personas in the AI infrastructure market, from hyperscalers to Gen AI startups, and gain insights into the future of compute for AI. Wes also shares valuable tips for companies venturing into GPU compute and finding the right partners.
We Cover:
- Why do AI infrastructure demands necessitate high-power density data centers, and how do they differ from traditional ones?
- What's driving the escalating demand for compute power in the AI market, and how does it fuel competition for electricity?
- Who are the key customer personas shaping the AI infrastructure landscape, from hyperscalers to Gen AI startups and enterprise giants?
- What are the diverse purchasing options for AI infrastructure, ranging from ownership to cloud services, and what choices exist for GPU compute?
- What are the common pitfalls to avoid when venturing into GPU compute, such as accessibility issues and hardware management challenges?
- How can you effectively locate GPU compute companies, whether through online searches or exploring smaller vendors alongside major cloud providers?
- What does the future hold for data centers as the gatekeepers of AI compute deployments, and why is there a growing focus on their power consumption?
Episode References:
- Nvidia’s H100 – What It Is, What It Does, and Why It Matters
- kW, MW and GW: How electricity units work and how to convert them
- The State of Digital Assets Data and Infrastructure: 2023 Edition Sponsored by Amberdata
- Nvidia H100s shipped 2023
- Hugging Face: The platform where the machine learning community collaborates on models, datasets, and applications
Useful? Let us know with a ⭐️⭐️⭐️⭐️⭐️ rating
Disclaimer: The opinions expressed in this episode are personal and do not reflect the official stance of any organization. Content is for informational purposes only.
Welcome to the AI-First Business Podcast with Tina Yazdi, where we show you how teams, companies, and leaders are turning AI hype into ROI. Welcome, Wes. How are you doing today? I'm great. Thanks for having me, Tina. Yeah, thanks for joining. I'm going to be honest with you: this is one of the episodes I'm most excited to record, because it gives us insight into, in your words, not mine, the 90 percent of the challenge with bringing AI to life this year and in the coming years, which even I didn't know much about. Just as a quick intro, Wes is currently the CEO and Chairman of Applied Digital. Prior to that, he founded 272 Capital and was an investment advisor focused on technology, hardware, software, and services companies. And it sounds like you sold that to B. Riley Financial, is that correct? That's right. Nice. And Applied Digital is also listed on Nasdaq. Wes is not only a veteran investor, but he knows this space deeply, and I hope we can dig into what it's going to take for data infrastructure to support AI technology, some of the gaps and the magnitude of those gaps, as well as a couple of other really important topics like the energy required to power all of this. Wes, did you want to say a few words about yourself and Applied Digital before we dig into the questions? Sure. You hit a lot of it, and I appreciate that intro. I've been a career tech investor, mostly in what I call hard tech: a lot of semiconductors, tech hardware, some software. Back in late 2020, I had an idea that came from an investment angle. We got our start in crypto, and the investment angle on crypto at that time was that the way you could, quote unquote, invest in crypto was to do anything but Bitcoin and deploy a lot of GPUs, mostly on Ethereum. There were other proof-of-work networks you could aim your GPU compute at to validate transactions, to be a miner on those networks. So the idea was large-scale deployment of GPUs; that was the original idea of the company. We had partnered with a company called Spark Pool out of China. They were, at the time, the largest Ethereum pool in the world, with about 25 percent of all the Ethereum hash going through their pool. That was really where we started. We raised some money, and this was the beginning of Applied. We bought a significant number of GPUs, all of which were going to be deployed in China, in the Sichuan province. We did our round in April of '21, and right before we deployed those GPUs at the end of May, China cracked down on crypto. It was a complete crackdown. So we were sitting there with our GPUs in a warehouse in Shenzhen, ready to ship, asking: do we ship them or not? China had kind of done this before; was it real this time? That happened on a Friday, and by the end of the weekend I realized we needed to ship them to the U.S. We did, and we deployed them in a different facility in upstate New York. What happened subsequently was we had the opportunity to build out Bitcoin data centers, so we'd already started to assemble a team, and we assembled a team.
There was a really shallow talent pool here in the U.S., because I think less than 5 percent of Bitcoin mining was happening in the U.S. at that time, while 70-plus percent was happening in China, and all of that needed to move somewhere. So I found a guy who was really good at power, along with a guy who had brought the first mining facility online in the U.S. at a company called MineCo that turned into Core Scientific, and we assembled a team. We found a site through our power guy, broke ground on it in North Dakota in September of '21, and that site was operational by January of '22. So it was really fast, about four months to get it going, and fully up a few months after that. Since then we've built out a total of 500 megawatts of these types of sites over 24 months. Then, about 18 months ago, a little longer than that at this point, I was looking around at what else could be done with these sites, and we started to build a very different style of high performance computing data center, very high power density, and that's what we'll spend a lot of time on today: this digital infrastructure. We started designing it and building it in '22, and we have some of that capacity running in '23. We also put in place the software pieces we needed to run a cloud service. That was in the fall of, sorry, I'm getting confused now because we're just barely into the new year, that was in the fall of '22. So we started building the data center in '22, and in October of '22 we put a partnership in place on software to run a cloud service. At the time, the only reason we did that was because we thought we would have to be our own first customer in our data center, since it was such a new style of high power density data center. Then fast forward: in December of '22, ChatGPT is introduced, and in March of '23 the H100 is introduced by NVIDIA, the latest data-center-class, highest-power GPU they'd ever brought to market. At that point you had a lot of generative AI companies out there, and ChatGPT was the big breakthrough on gen AI. We saw a rush of demand, and the switch really flipped for us around mid-to-late April of '23, when we started seeing all of this demand. We were still out shopping this data center capacity when we landed our first customer in Character.AI, a big gen AI customer that wanted thousands of H100 GPUs deployed, but they just wanted a service. So we launched our cloud service, and where we've taken the company now is that we're building data centers, HPC infrastructure, what I call next-generation digital infrastructure, we also offer a cloud service, and we have a large data center business on Bitcoin specifically. So we have all three of these things. The big transformation over the last year is the way digital infrastructure has to be built, and we just happened to be building this before it was obvious how it needed to be built. It was more speculative for us, and I wasn't expecting the gen AI boom by any means, but we kind of got lucky in that. That's the background of the company, how we came into being, and the growth we've had.
It's been a pretty crazy couple of years for us, but especially the last nine months, just with what we've seen: the changes in the market and the requirements on the infrastructure side. We can delve into the big requirements there, but it's been a fun time, that's for sure. That's for sure. Yeah, it sounds like there was a lot of right place at the right time doing the right things, but it also sounds like you seized the opportunity as soon as you saw it, maybe earlier than others. Just to give listeners a chance to catch up with some of these terms and infrastructure definitions in your world, can we set some ground-level definitions? Can you take a minute to walk the layman through some of the differences between traditional data centers, hyperscalers, high density energy, sorry if I misunderstood that. High power density. Yep. And what AI GPU cloud services refers to. I also saw on Applied Digital's pages the term supercomputing-as-a-service, which seems poised to take over what SaaS means, to some people at least. Can you give a really brief 101 on what all these things are, so we can continue the rest of the conversation with a bit of context? So the last 20-plus years of data center build-out have really all been communications driven. I call it comms driven: ultra low latency, high bandwidth. Think about the apps that have developed over the last 20 years, really the last 10 years: it's all video driven. Almost everything we use, all the consumption, everything on your phone or on your TV at home, or us sitting here doing this podcast on video, all of it is very, very video driven. And what video requires is not a huge amount of processing power; it requires very low latency connectivity to the data center. That means you need sub-10 milliseconds between where you're using it and the data center, or call it sub-20 millisecond round trips, because if latency were high, this experience would be bad for all of us. So it's been very driven around latency and not around compute. And sorry, I'm just breaking this down in case someone doesn't use the terms high latency and low latency; can you quickly define the difference between the two? So low latency means almost instantaneous. When you speak in terms of latency, you generally think in data center regions; that's why there are a lot of data centers deployed just outside of New York City or outside of LA, or in Dallas, which is a big region, right next to big population centers, because you want the content very close to the end user. So if you're using Netflix, for example, or you're streaming TikTok or whatever you're watching on video, if it's very slow, think of it like buffering. I'm old enough for this, and maybe a lot of people aren't, but when you started using the internet 15, 20-plus years ago, if you wanted to watch a video, there was a bandwidth and a latency issue. Bandwidth was the problem first, and then latency becomes the problem once you have plenty of bandwidth.
So when we all used to do dial-up and you wanted to watch a video, you clicked on it and then went and made yourself something to eat, because it was going to be buffering for 20 minutes before it even played; depending on the length of the video, it might be much longer than that. What the world has evolved into is high bandwidth, last-mile high bandwidth, but then at the data centers you need low latency, because if you and I were talking on this video app (I normally tell people Zoom; this isn't Zoom, but it's something similar) and there were a huge amount of latency, it would take time. Think of TV news reporters, when someone's coming in by satellite and it takes about 10 seconds for them to hear the question before the answer comes back. That's the latency aspect, and in the world we live in now, latency is just not tolerable by anyone. It used to be, like I said, that you would click, you would buffer, you'd go make something to eat and come back to maybe watch your video, and now people are throwing the remote if Netflix doesn't start within two seconds of clicking the button. So we've built our world around low latency, and that's all the video streaming apps we have, and the mission-critical apps we need as well; they require ultra low latency and very high uptimes. That's the world that's been built. So you have data centers that go from enterprise all the way to hyperscaler, but the vast majority of data centers built for these applications are configured in what we would call low power density. You have the latency aspect as one factor, but they're also configured in low power density because they don't require a lot of computing. If you imagine a data center, it's just rows of compute: blade servers stacked on top of each other, and you might get 20, 30, 40 of these in a single rack, with a network switch on top. In a traditional data center, they deliver around 7.5 kilowatts of power to that rack of servers, and typically that rack or cabinet of servers will only use about five kilowatts of that power, maybe even less. So that's a traditional data center: roughly seven kilowatts of power for an entire rack of web servers. Then, when you go to AI with the GPU servers, to put that in comparison, a single NVIDIA H100 server, which has eight cards in it, in the HGX or DGX configuration with InfiniBand, takes 10.2 kilowatts. Compare that to an entire rack using five kilowatts in a traditional data center. And not only does a single server need that, you need multiple servers in the same rack, and you need them all close together. So this is the high power density: you need it very dense, with everything close together.
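To make the density gap concrete, here is a rough back-of-the-envelope sketch in Python using the figures quoted above; the number of H100 servers per rack is an illustrative assumption, not Applied Digital's actual layout.

```python
# Back-of-the-envelope rack power comparison using the figures quoted in the episode.
TRADITIONAL_RACK_DELIVERED_KW = 7.5   # power typically delivered to a traditional rack
TRADITIONAL_RACK_USED_KW = 5.0        # power a traditional rack of web servers actually draws
H100_SERVER_KW = 10.2                 # a single 8-GPU NVIDIA H100 HGX/DGX server
SERVERS_PER_AI_RACK = 5               # assumed for illustration only

ai_rack_kw = H100_SERVER_KW * SERVERS_PER_AI_RACK
ratio = ai_rack_kw / TRADITIONAL_RACK_USED_KW

print(f"AI rack draw:      ~{ai_rack_kw:.0f} kW")                 # ~51 kW
print(f"Traditional rack:  ~{TRADITIONAL_RACK_USED_KW:.0f} kW")   # ~5 kW
print(f"Density ratio:     ~{ratio:.0f}x")                        # roughly the 10-to-1 reconfiguration mentioned later
```

Even a single H100 server already exceeds what a whole traditional rack is provisioned for, which is why these workloads are so hard to retrofit into existing facilities.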
The reason you need them close together is that you use a special type of networking, optical networking called InfiniBand; you can also do it in an Ethernet configuration called RoCE, but most of our clients are looking for InfiniBand. And the magic number on InfiniBand to put all of these GPUs on the same cluster is 30 meters: they need to be, I should say, less than 30 meters apart. That's your max for these to work, and this gets back to supercomputing-as-a-service. These work as supercomputers if they're all interconnected with InfiniBand, and they're the extremely powerful compute that's needed for model training, and in some instances for inference, but really for training AI models. Take ChatGPT: OpenAI came out with GPT-3, then 3.5, then 4, and then whatever comes next, and the training of the model is what gets it to the next level. Training more data into the model makes it better. So you need these close together. Traditional data centers can't even handle a single NVIDIA H100 server in an entire rack the way they're configured right now, much less put four, five, six of these in a rack and then put many of those racks close together. So you have this high power density issue. It's becoming really apparent in the market, and it's going to become even more so in 2024. That's what we're doing: we're building very high power density data centers that are built specifically for these types of workloads. Now, most of these workloads, especially on the training side, do not require low latency. They're much more tolerant of higher latency, because it's just moving large amounts of data inside the data center; the GPUs crunch to train the model for days, weeks, or months, and then you move the trained model, one big file, back out and typically place it in a lower latency setting for inference. Inference is when you use ChatGPT: you train the model, you upload the model, and then inference is what interacts with you and spits out the answers you want, and a lot of the different models work that way. So you don't need latency as low, especially for training, but you do need extremely high power density, and we need a lot of it. If you look at the NVIDIA forecasts and the forecasts in the market, if you're going to ship over two million H100-class GPUs over the next 12 or 18 months, you need over four gigawatts of power just in IT load, which translates into roughly six gigawatts of total power just to bring those online. And that's where we get to this being the bottleneck. So that's the 101: you're moving from traditional data centers (and that market, by the way, is still growing; this isn't a replacement, it's in addition) to needing this high power density configuration. And it's hard to retrofit current data centers, by the way; it's better, in my opinion, to do greenfield new builds. You said that the current low latency data centers that support things like video can't support this, and one piece is the retrofitting. To clarify, is this because they, one, don't have space, and two, don't have the energy load capabilities to power this? Or is it a bit of both? They would have to reconfigure two things.
They'd have to reconfigure the power density, and typically you can't just snap your fingers and get more power into the data center. If you had a data center commissioned for 50 megawatts sitting in, not downtown Dallas, but the Dallas metro, and you all of a sudden want to reconfigure it to the new power density, you're typically reconfiguring to something like ten to one on the power density. And if you can't bring more power into the building, you're going to have 90 percent of your floor space empty, because you don't have the power even though you want to reconfigure. So you need to do that, and getting the additional power is a big issue. But then, on top of that, cooling becomes an issue. When you use a lot of electricity, and computing uses a lot of electricity, it creates a lot of heat, and these buildings typically aren't built for the type of cooling you would need for these high power density workloads. So it makes it much harder. Next up, I'd love to dig into an overview of the market and where it's headed, and as part of that, the situation with chip supply, energy, and geographical limitations. Then maybe after that we can transition a little more to supercomputing-as-a-service and onboarding customers who don't have as much context as some of the customers you're already working with, who are ahead of the game. So, digging back in, and thank you for setting the scene on the transition from traditional infrastructure to the type of infrastructure AI needs: how are you seeing the marketplace in data centers, whether it's existing data centers trying to transition to capture this demand or new ones entering the market? Is this an easy market to break into if you have the resources? Is there a lot of competition, or can anyone who is able to get into it do so right now, just because demand is so high? It takes a lot of expertise in this market. I won't claim to be the expert myself, but what I have done is surround myself with people who know exactly what they're doing in this market. It takes several things. You need to find the power, you need to select the right site, and that site needs to have a lot of different things going for it to be a viable option. An abundance of available power at the location is number one for us, but then you need to check a lot of other boxes at that site; it can't just be anywhere. One of the biggest boxes to check is fiber connectivity to the site, which is a bigger issue for us because we have a Bitcoin data center business, which is still running at the moment, and now we're also building a high performance compute business, but not all of our Bitcoin data center sites work as HPC sites. Our sites in North Dakota both work, but our one in Texas doesn't, because it doesn't have the necessary fiber infrastructure nearby. And you don't just need fiber, you need redundant fiber, meaning multiple different providers. So you have to find this, then you have to design a facility; we have our own design that we've been refining for the last couple of years.
So you need to know how to design the facility, then you need to be able to build the facility, and then you need the capital that goes into it, because these are very expensive. We're in the middle of what has been my vision: the facility we're building in Ellendale, North Dakota is going to be a hundred megawatts in a single building, and it's what I call the AI brain data center. It's designed, and will be built, in such a way that you could actually put all of the GPUs inside the building on the same compute cluster, so all of those H100-class GPUs on one cluster. If you did that, it would be far and away the largest supercomputer in the world. It's unlikely we'll have a customer that wants all of them in the same cluster, but it's designed that way, and it's done over multiple levels. It's not an easy business to break into. There are other people doing it; building data centers has been a great business for a very long time. Equinix is the public company that's the leader in traditional data centers. I haven't really seen them come into this market yet; I'm sure they have plans and will try. It's just such a big growth market that I think a lot of people will try, so there are a lot of people trying to get into it. The biggest gating factor right now is availability of power, and specifically power that will come online over the next 24 months. We have 400 megawatts of power that can come online over the next 24 months, so we're sitting in a really good place, but that's the biggest issue right now: power availability in the near term. Yeah. Actually, let's transition into energy a bit deeper, because a really interesting point you made in our last conversation, and I learned a lot from it, is the competition for electricity, which is already limited in excess availability in many geographies, and the competition against the electrification of vehicles, hydrogen manufacturing, et cetera. Can you talk a little about how big a leap in gigawatts you foresee in the next two to three years, and how these other changes in society and infrastructure outside of computing impact that? So let's just talk about the AI infrastructure. I talked a little bit about the power consumption, and I said that just the NVIDIA shipments are going to need over four gigawatts of IT capacity; when I say four gigawatts of IT capacity, it's really more like four and a half. Then you need to multiply that by typically about 1.35, because there's additional electricity needed for cooling and for mechanical inside the data center; there's the IT load and then there's everything else that supports it. So you're looking, for the next 12 to 18 months, at six gigawatts of power that needs to come online, mostly just for NVIDIA. I'm not counting AMD or Intel or any of the other players in the market.
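As a minimal sketch of the arithmetic Wes just described (roughly 4.5 GW of IT load from the next 12 to 18 months of NVIDIA shipments, times a roughly 1.35 cooling and mechanical overhead), with the function and variable names here being purely illustrative:

```python
def total_facility_power_gw(it_load_gw: float, overhead_factor: float = 1.35) -> float:
    """Scale IT load up by the episode's ~1.35 multiplier for cooling and mechanical load."""
    return it_load_gw * overhead_factor

it_load_gw = 4.5  # GW of IT load Wes estimates for upcoming NVIDIA shipments
total_gw = total_facility_power_gw(it_load_gw)
print(f"Total power to bring online: ~{total_gw:.1f} GW")  # ~6.1 GW, i.e. the roughly 6 GW he cites
```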
Sorry if this is a really silly question: when you say just for NVIDIA, is that for them to produce the chips? I mean to support them coming online. Support them coming online. And is this for all the deployed NVIDIA chips, like the 550,000 that were deployed? What I'm really looking at is what NVIDIA is expected to ship over the next 12 to 18 months, and when I look at that, that's what I'm talking about with that additional roughly six gigawatts of power. Let's put that in perspective: the North American data center market is roughly 22 gigawatts. So that's a big leap in a single year, and it likely doesn't stop there. Is that for the entire data infrastructure? Correct. The whole of the United States is 22, and I'm talking about another six in 12 to 18 months. And then through 2030, I've seen forecasts as high as close to 40 gigawatts of power that needs to be added in data center capacity. It's a big number. We're in '24, so that's six years away. Even if it's 30, you're trying to recreate what was created in the last 20 years in six or seven years, or even do more than that. And you'll continue to build for those traditional applications as well; those aren't going away. Again, this does not replace that. So it's a massive amount of power that needs to come online over the next five or six years, but I think the biggest challenge is the next 24 months, because this whole thing caught the data center industry really off guard. I've had a lot of conversations, especially on our cloud product, because we do use some third-party providers, where people are just trying to figure out what's going on. Well, here's what's going on: I think most people think about AI as being, in many ways, like software 3.0, and in one small way it is, but it's just not; it's very different. When you look at these companies, the ones we're servicing, they have millions and millions of users and typically sub-50 employees. It's not an army of programmers; it's all compute driven. Historically, when you did software, you'd have a lot of programmers, and then you'd load it up on really standard hardware; it didn't consume a lot of power, and it would run whatever application you were looking for. Now, on the AI side, it's really compute driven. It requires the infrastructure to drive the AI. It's very programmer-light, I would call it, and very compute intensive, so it's just a very different dynamic in the market. And you have this AI build-out where everyone's rushing to get this stuff online, because it's an arms race in many ways right now: whoever has the best model gets the most users, gets data, retrains the model, and gets an advantage. It's like the Tesla advantage on FSD, so they get more data. I'm sorry if this is really a stupid question, but what is FSD? Full self-driving. Full self-driving: they get all the data feedback from all the miles driven by Tesla drivers. I think Tesla will be one of the biggest users of NVIDIA or that type of gear, training an FSD model. They take all those data points and can get better and better because they have more cars on the road collecting them. It's something similar with the AI models: the more you gather, the more users you get.
And so it's this arms race to get the model to market, so you want a lot of H100 capacity, which is how you get there; it's not programmers. You have a rush to put this online, you're going to need a lot over 24 months, and you're going to need a huge amount over six years, over ten years, if, and I strongly believe this, this is the next wave in technology. You're going to need a lot of compute and a lot of data center power, and then you're competing with the electrification of vehicles we talked about, and with hydrogen: green hydrogen requires a huge amount of electricity to generate. So you have to figure out where to go and how to procure the power, and you need to be somewhat creative in how you do that. We do that really well as a company. You do have some things that help you in this industry, which is that it's not latency sensitive; for some of these applications you can go to more unique locations for data center capacity. But there's huge competition in the market for power right now. One of the largest power providers in the U.S., and we speak with them, we speak with everyone, we have, like I said, a great power team that finds power: back in October, when we were speaking with them, they said that in the last eight weeks they had received something like 12 requests for between 500 megawatts and a gigawatt of power for data center applications. That's a massive, massive number. That's a huge leap. Just before we move too far from a point you made a couple of minutes ago, that a lot of people think AI is simply software 3.0 when it's so much more than that: I was recently listening to the 2024 predictions by Clem Delangue, I hope I'm pronouncing that right, who's the co-founder and CEO of Hugging Face. One of his points was that there are a lot of entrants competing in this arms race you just referred to, but the prediction for 2024 is that a lot of them are going to absolutely fail to get to profitability, which probably doesn't surprise you. Can you talk a little more about what proportion of that P&L is related to this energy usage and infrastructure, which someone building a little app in their bedroom may not have accounted for, making that path to profitability a little harder than anticipated? It's a huge cost for the gen AI companies. It's the number one cost; the compute is by far the number one cost. Just to put it in perspective, if you come to us, and I feel like we're a low-cost provider on our cloud service, and you want a thousand GPUs, they actually go in multiples of eight because of the server configuration, so we deploy clusters that have 1,024 H100 GPUs in them. You can break that down into 256 or 512, but generally people will want something in that range if they're doing language model training or training their own model on top of an open source model, depending on their speed to market. A cluster from us of 1,024 GPUs on an annual basis will cost you about 20 million dollars. We have customers where there's demand for up to 10,000-plus of these, and we have a lot of customers with demands in the one-to-five-thousand range.
They want them all in the same cluster, by the way, because they're much more efficient that way. So that's the cost of what you're looking at: if you need 1,024 GPUs, it's going to cost you about 20 million bucks a year. If you're doing model training, that's the scale you need to get to, I think, to compete effectively in the market. So it is a big expense. Now, the one thing that's good on the flip side, as I mentioned, is that for a lot of our customers, you're not paying a lot of expensive software developers to develop software for you. A software dev, fully loaded, might cost you a couple of hundred thousand bucks a year. You can outsource that to lower-cost locations, but you can be hit and miss on what you get out of that. So you are replacing one cost with another. The issue right now, and I think this will get better over time, is this: I think the training side is okay, because you can think of training as your research and development line item. Similar to older software companies that paid a lot of money to develop their software, you're paying maybe something similar on the training side. The issue, more so, is when you move into, let's call it production, where you're using inference. Inference is still very expensive right now. I think it will get cheaper over time, but that's the big one. What would account for it getting cheaper over time? I think there are going to be a lot of inference solutions in the market. In many cases, especially if you're doing LLMs, large language models, I think you'll get lower-cost solutions on the inferencing side. When you're thinking of video or image gen, you're probably going to stick with pretty high costs, but people will pay for that. Software coding is probably going to stick with higher costs on the inference side too, but going back to the software coding example: you did your R&D to produce the copilot that helps with software, like GitHub Copilot, but then someone else is using that copilot, and even though they're using inference, they're using it for their own R&D. So it's another kind of R&D stack. Image gen is similar: even though you spent a lot of money to make your image model or your video model, the people using it are making something that they'll either be able to use many times or resell or whatever it might be. So I think that's fine. But I do think inference costs will come down over time, especially on the LLM side, because it would be hard as an LLM company right now: inference is only about 20 or 25 percent cheaper than training, just because you don't need the same type of gear from a networking perspective. That cost needs to come down, because think about your LLM going into production: if your ongoing cost is the same as your development cost, it's going to be really hard to make a business model out of that. So I do think the inference side will come down, but for people getting into it, getting GPU capacity on the development side is expensive.
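As rough arithmetic on what that reserved pricing implies per GPU-hour (a sketch from the figures quoted in this exchange; actual contract terms will of course vary):

```python
# Implied effective rate for a reserved cluster of 1,024 H100 GPUs at roughly $20M per year.
ANNUAL_COST_USD = 20_000_000
GPUS_IN_CLUSTER = 1_024
HOURS_PER_YEAR = 24 * 365

per_gpu_hour = ANNUAL_COST_USD / (GPUS_IN_CLUSTER * HOURS_PER_YEAR)
print(f"Effective reserved rate: ~${per_gpu_hour:.2f} per GPU-hour")  # roughly $2.20-$2.25

# Wes notes inference-oriented gear runs about 20-25% cheaper than training gear,
# mainly because the networking requirements are lighter.
low, high = per_gpu_hour * 0.75, per_gpu_hour * 0.80
print(f"Rough inference-class range: ~${low:.2f} to ${high:.2f} per GPU-hour")
```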
However, you're replacing a lot of people you would normally have had to hire, and you're probably going to get a much more predictable result out of the compute. This leads me to a question about customer personas. Based on our previous conversation, it sounds like there was a really clear early wave, maybe some of those first customers you worked with, like Character.AI, that understood the need for this infrastructure and moved early to invest and set it up. And then there are maybe the more mainstream customers, big tech companies, and maybe some late adopters. Do you have a perspective on the personas of these three categories? Where are we today? Are we still in the early wave of companies building up and investing in this infrastructure? Has this been happening for a while behind the scenes? And what kinds of companies or organizations should already be thinking about this, or should start budgeting for it maybe five to ten years out? Do you have a general perspective on that? Let me talk a little bit about how we've seen the market develop. I like to refer to it as a barbell type of market: two big pieces and nothing in the middle. On one side it's been the hyperscalers. Everyone knows Meta has been big in this, Google's been big, Microsoft's been big, mostly through OpenAI, Amazon's in there, and Apple's getting in the mix now. Those are the five hyperscalers most people refer to, and that group has been pretty early in the market. And then you've had the gen AI startups, the generative AI startups. OpenAI is obviously the most prominent of those, and then you have our customer Character.AI, which is a language model company; they have a very cool app you can pull from the app store and check out. You've heard names like Anthropic, names like Midjourney, which is an image generator, names like Runway, which is video, and there's a big group of these gen AI startups I call foundational models; Adept is a software model. So you have all of these gen AI startups on one side and the hyperscalers on the other, and those have been the two groups that have been super active in the market. We've found most of our cloud service customers among the gen AI startups. Now what I'm seeing come into the market is what I call enterprise: companies that have their own business. The enterprise market is going to be a massive part of this market, so it's been exciting to see that start up, and it can range anywhere from software companies to services companies to consumer products companies. At some point airlines will be in this, hotels will be in this, and you can just think of all the applications they're going to have. But what we're seeing right now are those types of companies more on the software side, because I think all the software companies have been thinking about what their AI strategy is going to be for their customers.
They need to offer a product for their customers, whether that's a copilot or something along those lines. So we're starting to see them come into the market, not just trying to figure out what they're doing; clearly they've been trying to figure that out, but now they're coming into the market to get large amounts of GPU compute, and that's where we're running into them. It's pretty exciting to see that start to come into the market, and it's going to be a massive part of it. But to the things I was talking about: if you're a consumer-facing company, you need to start thinking about your help desk; it needs to be a gen AI chatbot. That's going to be a really easy one, and there are companies out there already doing white-label chatbots for other companies, but I think you're going to see a lot of growth there. If you're an enterprise-facing company, you're still going to need that even inside your own organization. Staying on the chatbot help desk idea, maybe an AI IT help desk inside of large companies, or an AI IT help desk for outsourced IT if you're dealing with small companies. And the first big group that jumped on this inside every organization has been marketing; marketing was right on top of ChatGPT. Talk to most people who do marketing: this has been an absolutely fantastic new tool for creating marketing materials. For larger enterprises, you're going to want to take some of that in house. A lot of companies have restricted their employees' access to ChatGPT because they don't want company info going into ChatGPT, so they're going to train their own LLM that they license from one of the large LLM providers, one of the foundational model providers, and train their own data into it. These are the types of companies I'm excited to see start to come into the market, because they're established; they have a business and they make money. Because there is going to be a pretty significant graveyard at some point among the startups, just like every time we've run through a new cycle of any new technology: there always end up being some super successful companies, some moderately successful companies, and then a graveyard of the rest. Can you talk us through the differences between purchasing a cluster directly from Applied Digital and your supercompute-as-a-service? What kind of service is appropriate for what type of use cases? Yeah, that's a great question. There are several different ways for people to go about this. Do they want to own their own compute? Do you want to go procure the GPUs, design the deployment, design your cluster, get all the appropriate pieces you need, then go procure data center space and put that up? There are a lot of people and companies that are going to do that. Would that be the barbell? It could be, but the gen AI startups have been using us on, I wouldn't say an on-demand service, but a cloud service, and on the hyperscale side they've been building it all themselves.
The enterprise will be a mix. It'll be a mix of cloud like we offer, or private cloud, which is really one of the big things we offer, and then there will be on-prem as well; I'm sure of that. So there's going to be a mix, but if you want to go procure your own hardware, run it, keep up the maintenance, and operate it, you can do that. For smaller users, though, you're going to be better off using a service like ours, which is a turnkey service. Right now most of our service is dedicated: we get a large contract where someone reserves all the capacity on those GPUs 24 hours a day, 365 days a year, typically for a two-year contract. We do have an on-demand solution, which we do with partners, where you just pay by the hour that you use, and for smaller companies that's absolutely the way to go. Now, it's more expensive, and if you see your needs growing, you're going to want to move to a provider like us but use a reserved contract, which gets a discount because it's a guarantee. So you're going to see a mix of all these different flavors, but for the vast majority of people, if you're listening to this podcast and looking at doing something yourself, even in a medium-sized organization, you're going to start with some on-demand. We offer that, AWS offers that, Azure offers that, Lambda Labs offers that. That's part of the market. Then, if you see your compute needs growing, you're going to have to move to either the reserved compute model, which is also with us, or procuring your own hardware and data center space. And again, I'm a layman, so sorry if this is a silly question, but when you say procure your own hardware, does that mean they're basically leasing or even owning a piece of a data center? What is that exactly? You're going to purchase servers, typically from someone like HP or Dell or Supermicro, and they're going to put the NVIDIA cards in there. Oh, like literally build your own actual infrastructure. Build your own infrastructure. Ah, I understand. Thank you for breaking down the on-demand versus reserved definitions. Are there some common mistakes or gaps that you've been encountering, maybe particularly in the last six months, when companies are exploring this? And if so, do you have any advice on what companies should be thinking about as they explore the right choice for them? Gaining access to compute has been a gating factor for a lot of the companies we deal with. For companies that are looking, the pitfall we have seen is just access to compute. To be specific, I don't have a view of every problem companies have; it's generally the ones we're solving. So access to compute has been the biggest one we've seen over the last nine months that we've really been doing this. For a lot of companies it helps to know, and I'm not just trying to plug our business, that we are here, that there are other smaller vendors out there, and that it doesn't just have to be AWS or Azure to get access to the compute. But I would highly recommend, even if it is the cloud providers: the architecture of these clusters and running this equipment is very difficult.
So people trying to go at it themselves, like a medium-sized enterprise that has managed its own IT before: this is different. It's a very different set of hardware and a very different set of requirements, so I would highly recommend, as you move into this, using a provider like ourselves or even one of the hyperscalers, because it's just different, and you'll find out, with very expensive equipment, that it's hard to manage. On that note, can you share a little more about how customers can connect with and find companies like yours, or Applied Digital specifically? I personally had never come across your company before our first connection. Where do you look? How do you connect? A Google search works, although you actually won't find us if you're looking for GPU compute, and the reason is that, until recently, we'd never had a single salesperson inside the company; we hired our first back in November. Can you share why that was the case? We haven't needed it: almost since the inception of the company, we've had more demand than supply of what we provide, and that still remains the case to this day, by a fairly wide margin. The reason we added a salesperson is that I started to see this enterprise market develop, and we want to go really aggressively after it, so having us out there and marketing ourselves is finally appropriate. But you won't see us advertise on Google; all of our capacity is generally spoken for. If you Google, though, you'll find some of these other providers, like I said, Lambda Labs or CoreWeave, which are the ones I would count as competitors to us on the cloud side, and you'll run across AWS or Azure, but you will find some of the smaller providers out there. Okay, got it. Just for the last few minutes, are there any predictions or final words of advice that you would like to end on? My biggest prediction for '24 goes back to data center space, and people don't think about it; no one ever thinks about it. I have kids, and a lot of the people I interact with have no idea how things show up on their phone; it just happens. But for everything you use, there's a data center somewhere nearby running the compute workload for you. Data center space, which has historically been a pretty boring market, is going to be the bottleneck for AI deployments this year. I think there was a big bottleneck in NVIDIA GPU deployments last year, and it's still tight in that market, especially on the InfiniBand portion of it, but the data center capacity to turn those GPUs on is the issue now. Fixing additional wafer throughput, more advanced packaging capacity, or high bandwidth memory, which were the bottlenecks on the equipment shipments from NVIDIA last year, is to me much easier than finding a large amount of power, securing that power, getting delivery of that power, getting permits to build, and building large buildings that are fairly complex. That's just going to take a longer amount of time than fixing the supply chain issues on GPUs.
And so as we get into '24, I think we're already starting to see that issue, but it's going to pop up more and more as a primary issue on deployments of compute for AI. Then I think you'll start to see a lot of articles around the power consumption of AI. You've probably seen some of that already, but you're going to see a lot of it in '24, because usage is really going to start ramping up in '24. For us, all three of our sites are co-located with wind farms. We find stranded power, generally from renewable sources; right now all of it is wind. Stranded power just means power that can't be delivered anywhere: there's no infrastructure in place to carry that power from where it's generated to an end use. These situations exist, and a lot of them exist in the U.S., sometimes because a business shut down. There have been locations where, for example, a big aluminum smelting facility went out of business, so there was big power infrastructure there for it, and now it's gone, and that power can't be sent anywhere else. But for us it's all been renewable deployments, renewable developments, primarily in North Dakota, the sixth-largest wind state in the nation and the 49th-largest population state, I believe are the numbers. They produce over double the amount of power they consume in the state, and a lot of that can't be transported out, so in a lot of instances when the wind is blowing and power usage is low, wind farms get curtailed, because the grid runs too negative. We come in, balance that out, and find that power. But you're going to start to see more and more talk about the power consumption of data centers. That's been a big issue for a while already, but it's going to become a growing one. My number one prediction, though, is that I think people will be surprised that data centers become the gating factor on deployments of compute for AI. Great. Wes, thank you so much for joining us and sharing your expertise, your perspective, and your predictions. I wish you a wonderful day, and hopefully we'll chat again sometime. Tina, thanks for having me. I really appreciate it. Thanks for listening. If you've gained value from this episode, please drop us a five-star rating on Spotify or a like on YouTube. See you for the next one.