What the futr

What the futr is a biweekly podcast that explores the intersection of AI, sales, and humanity. Hosted by Sandesh Patel and Chris Brandt, each episode features AI startup founders and tech leaders sharing real stories, their value proposition, and visions for the future—structured like a smart first-call sales meeting. It’s all about making AI make sense for businesses—and helping people stay informed, not left behind.

Delivering Scalable AI Infrastructure: Alex Yeh, Founder & CEO GMI Cloud – Ep 3

July 15, 2025 • Sandesh & Chris | futr connect • Episode 3

Welcome to Episode 3 of What the futr.

In this episode, hosts Sandesh Patel and Chris Brandt sit down with Alex Yeh, Founder & CEO of GMI Cloud, to explore how he’s building the "Shopify of AI".

Alex takes us through his incredible journey from mining Bitcoin in China to setting up data centers in the U.S., and now to powering the next generation of AI builders with GMI Cloud. Learn the origin story of GMI Cloud, the pain points it addresses, its ideal customers, its competitive edge in the AI cloud space, and its vision for a more accessible, developer-first AI infrastructure. If you're an AI enthusiast or building in the AI space, this episode is packed with insights you won’t want to miss.

Subscribe for regular episodes.

👉 Explore the platform: https://futrconnect.io

👉 Business inquiries: sandesh@futrconnect.io

00:00:00:00 - 00:00:23:08
Sandesh
Thank you for tuning in to another What the futr podcast. Today we have Alex Yeh, CEO and founder of GMI Cloud. GMI Cloud describes themselves as “the Shopify of AI”. If you don't completely understand what that is, stay tuned and Alex will explain. But in the meantime, what I took away from this podcast was the complexity of the entire AI stack.

00:00:23:08 - 00:00:44:10
Sandesh
It's not just the hardware. It's also the software, but it's also the people and the intelligence that we need to make the entire stack run. We go to ChatGPT, we press a button and we get these results. But what's going on behind the scenes is absolutely complicated and exciting. And that is what we dig into today. Thank you for tuning in.

00:00:44:11 - 00:00:49:03
Sandesh
Hope you enjoy the show.

00:00:49:05 - 00:00:59:02
Sandesh
We've got to change the game.

00:00:59:04 - 00:01:23:07
Sandesh
One of the biggest challenges with AI is access to GPUs and not just the GPUs, but the entire stack. So you can actually do smart stuff with your GPUs. So today it's an honor to have Alex Yeh, the CEO and founder of GMI Cloud. They're disrupting that space of GPU data centers. Thank you so much, Alex, for joining us today.

00:01:23:09 - 00:01:24:14
Alex Yeh
Thank you for having me.

00:01:24:15 - 00:01:44:15
Sandesh
You know, Alex, the first time Chris and I talked to you, I have to say that was a fun conversation. I wish it didn't end, but I know that you had to leave us. But this conversation around GPU data centers is just so fun. It's a huge market. You guys are doing some great things.

00:01:44:17 - 00:01:54:11
Sandesh
Can you tell us again how Alex got here? Can you tell us a little more, from the beginning to what brought you here today?

00:01:54:13 - 00:02:21:11
Alex Yeh
Yeah. So I guess I have to start with GMI's name. It stands for General Machine Intelligence Cloud. So it has that IBM vibe. Okay. Or people kind of call us “They Gonna Make It Cloud”. Yeah. So I started this in about 2019, when I dabbled into kind of the edge partnership process.

00:02:21:13 - 00:02:44:23
Alex Yeh
So I graduated from Hopkins, with that material essentially degree. And then I later became the youngest partner of Headline, which is an SF-based venture capital firm. I became the youngest partner and started the entire crypto journey there.

00:02:45:01 - 00:03:04:16
Alex Yeh
And during that time I invested in Bitcoin mining; my high school friend had a data center in Inner Mongolia, China. So I was just an investor, right? He helped me buy some servers, put them in his data center, and pointed them to my wallet as well.

00:03:04:21 - 00:03:24:08
Alex Yeh
You know, Bitcoin comes into play. So that thing lasted like two years or so. Basically, I wasn't doing much. But unfortunately Xi, from China, wasn't all that fond of Bitcoin, so he kind of shut that down. And then one day, now I'm thinking of, you know, the whole tariff thing.

00:03:24:10 - 00:03:45:07
Alex Yeh
At least Trump gave us 90 days. China gave us one day. It wasn't even a notice. I saw it on WeChat, which is kind of the Facebook of China. I literally saw it on social media. I was like, oh, Bitcoin mining is shutting down today. And when I actually checked my wallet, I was like, it's going to zero, right?

00:03:45:08 - 00:04:07:15
Alex Yeh
Okay. So I was like, okay, I either write down my investments or I can find them another home somewhere else. So I chose the latter. And so I flew to the US, one-way ticket to New York. Didn't have many connections aside from, you know, the four years I spent in college there. So I was like, okay, how hard could it be?

00:04:07:17 - 00:04:36:18
Alex Yeh
Right. Just find a new data center, move the servers over. But about 80% of the capacity was coming from China. At the time, everyone was moving out. This is pre-war, so everyone went to, like, Ukraine, Russia, Kazakhstan, some of the Southeast Asian countries, and obviously the US and Canada were huge. And so no one had capacity, and no one wanted to talk to an amateur Bitcoin miner, because my scale was too small.

00:04:36:20 - 00:05:00:04
Alex Yeh
And so I did another thing, which is like, okay, how hard could it be? So I decided to build it myself. It was a stressful kind of six-months-to-a-year process. I scoured the Earth. I looked at Google Maps every single day, picked up my phone, you know, my phone book, and started dialing, looking for a local chamber of commerce.

00:05:00:04 - 00:05:22:05
Alex Yeh
And my line would be, hey, my name is Alex. I want to buy power. I want to buy land. I want to build a data center. So, fast forward to 2024, I was able to build three data centers across America, in Arkansas and Texas. And even though we pivoted to the cloud space, which I'll get to.

00:05:22:07 - 00:05:47:04
Alex Yeh
Currently, we're still the largest Bitcoin miner in Arkansas, even after the pivot. And because I'm in Silicon Valley, about a year ago a lot of friends and the VCs that backed me and supported me said, hey, Alex, my portfolio companies, my founder friends are like, hey, I'm doing AI training and I need some GPUs. You must have a lot of GPUs because you deal with Bitcoin mining and servers. I do.

00:05:47:06 - 00:06:14:11
Alex Yeh
So that's where the pivot began. Initially it was just getting people GPU access, right? And then it kind of evolved into something that's more profound and complicated than that, right? People don't just need the GPU access, but also the entire ML stack. And now it's going into kind of the API access, so that people don't even want to interact with GPUs anymore.

00:06:14:11 - 00:06:26:02
Alex Yeh
They just want to have a simple API for them to play with. So that's kind of the whole journey of how I started in Bitcoin mining and then pivoted to cloud, up to today.

00:06:26:04 - 00:06:44:03
Sandesh
So your team describes you guys as "the Shopify of AI". I love that tagline. So can you tell us a little bit about the company, the problem you are solving, and what you think the opportunity is?

00:06:44:05 - 00:07:15:12
Alex Yeh
Yeah. That's right. I think there's a new term, right, called neoclouds. Basically this new suite of cloud service providers that are not, you know, GCP and the like, and everyone tries to be the new AWS, right? But we're working on a very different path. First of all, I think I've never heard anyone say, hey, my cloud bill is actually pretty low, right?

00:07:15:15 - 00:07:55:02
Alex Yeh
Never. No CTO will tell you that. People are always complaining that, you know, they're being taken advantage of, that storage or compute costs are way too high, and they aren't able to move. So the whole Shopify idea is basically moving the cloud stack to their local edge, right? So they have the same cloud experience without moving their data over, which saves a lot of effort and time for larger enterprises that already have these kinds of cloud relationships.

00:07:55:04 - 00:08:14:11
Alex Yeh
So that's where the Shopify idea comes from. And if I may go a little bit deeper, we structure our cloud into kind of three building blocks. One is the GPU hardware, where I don't have to discuss too much; we're a top ten Nvidia partner. So we buy Nvidia GPUs.

00:08:14:11 - 00:08:52:19
Alex Yeh
And we design it in the same way as the reference architecture designed by Nvidia, right? The second stack is called the cluster engine. Think of it as the operating system, the Windows or the hypervisor layer of GPUs. It manages, controls, and provisions the GPUs, everything, right? And on top of that is the inference engine, which is the API, model-as-a-service layer, where people don't interact with GPUs directly; they will just be picking the models that they want to use.

00:08:52:21 - 00:09:16:22
Alex Yeh
And start building applications. Or they already have their own fine-tuned or pre-trained models, bring them over, and host them with us, and we can sell them in an API format, right? So they can bring their own model, or we can provide models that we built that are highly optimized, distilled, and whatnot. And so those are kind of the three-plus building blocks.

00:09:17:00 - 00:09:39:20
Alex Yeh
And the reason why I say it's the Shopify of AI is because people can choose which blocks they want. They can buy these three things in combination. If they want our experience, they can host in our data center. That's fine, right? But if the enterprise says, I already have GPUs, just give me the cluster engine, which happens a lot, then we can license that piece, right?

00:09:39:22 - 00:09:59:00
Alex Yeh
And if they want both the cluster engine and the inference engine, we can also do that, right? Or if they say, I have my own orchestrator, I have my own GPUs, but hey, I want your model library, right? I want this model library to play with. We can license that.

00:09:59:03 - 00:10:13:09
Alex Yeh
You know, this is a container, right? So we can move that over. So it makes it very seamless for anyone that wants to build AI instantly, without kind of disrupting their existing, data flow.

00:10:13:11 - 00:10:33:02
Chris
So you're doing something that's really kind of interesting, right? I mean, one, it's like you've got kind of a consumption model for consuming GPU, really. I mean, you're sort of abstracting the GPU away and just saying, you know, here's the resources that you need to do what you're going to do, right? Which I think is really interesting.

00:10:33:04 - 00:10:58:16
Chris
And I think, you know, you're taking a lot of the like, the hard part or even the part that isn't within the core competency of most businesses to do it, kind of stripping that away, which is really interesting. But I think, you know, when we talked earlier, there's another aspect of this that I think is really interesting, too, because, you know, so much has been focused on the amount of resources that it takes to do training.

00:10:58:18 - 00:11:14:15
Chris
Right. And there's been a lot of focus on building, you know, clusters for training AI, but your vision is more on the inference side. And you're seeing the bigger business really kind of emerging there. Could you talk about that a little bit?

00:11:14:17 - 00:11:42:08
Alex Yeh
That's right. So I have a view of the market; it's my subjective kind of view of the world. This feels a lot like the 2010 to 2014 era, where the iPhone had just become pretty decent. LTE is actually, at least we're at 4G, around that, right? And you can do video streaming, you can do these high-bandwidth things.

00:11:42:10 - 00:12:12:12
Alex Yeh
Right? So you have, like, Ubers and Airbnb, these really great apps being built. And I see that, literally, if you go to Silicon Valley, everyone's talking about agents. Everyone's building their own app, right, so to speak. So I feel the same right now. Everything is getting started, because in the last two years these really, you know, top companies have spent billions of dollars doing this foundational model training.

00:12:12:16 - 00:12:35:19
Alex Yeh
Right? Many billions of dollars. OpenAI and Anthropic and DeepSeek, Grok and Qwen, and all these, like, Magnificent Seven, right? And also the Chinese shops are open sourcing a lot of things. And now the models are good enough that people are like, oh, wow, this is actually pretty good. And they're open sourced, right? So they can start building different applications themselves.

00:12:35:20 - 00:12:58:05
Alex Yeh
And you no longer need a platform-level or infrastructure-level person, right? Because provisioning all the GPUs is actually quite difficult, and not many people have that capability. You had to have a team, you had to have some sort of company in order to do that. So that means you needed to raise like 5 to 10 million dollars in order to do something like that.

00:12:58:09 - 00:13:25:07
Alex Yeh
But now literally a college grad will be able to say, I have an idea, I can pull the API and start building cool apps. And the coding part, you can outsource that to, like, Windsurf or Cursor. So it just makes the entire deployment process so simple and so fast. And so I just see a lot of proliferation of different products being built.

00:13:25:07 - 00:13:44:12
Alex Yeh
It doesn't have to be like a genius idea. It could be a very simple kind of search query of a specific kind, right? One is like, I just upload a Bible, right? And then I charge, like, ten bucks a month. It's like a cheap system. It's just, like, a very simple idea.

00:13:44:12 - 00:14:02:18
Alex Yeh
And you'll be able to monetize. AI scientist influencers and such have some search query on all the competitors and things like that. AI has been pretty good at doing these types of things, and people are just building different models right now.

00:14:02:20 - 00:14:29:19
Chris
So the interesting thing about that is, if you look at, you know, some of the big seven you mentioned there, what we're seeing is there's just massive infrastructure costs to training these models. But at the same time, you're seeing the commodification of large language models, of foundational models, where, you know, the cost per token went, I think, from like 20 bucks to like $0.07 or something now, or even lower.

00:14:29:19 - 00:14:53:02
Chris
Right? I mean, so, you know, the economic case for building there, I think, is strained. Whereas what you're doing, focusing on a lot of the tools that you need to do the work with it, I think that's really where there's more upside potential at the end of the day.

00:14:53:04 - 00:15:11:05
Alex Yeh
100%. I mean, I forgot which CEO or someone said it, like, if you have a 10x reduction in cost of some kind, you will have a 100x, 1000x increase in,

00:15:11:07 - 00:15:12:03
Chris
Consumption.

00:15:12:05 - 00:15:31:23
Alex Yeh
I think in consumption, right. It's kind of the same thesis as Elon saying, hey, if we lower the cost of space travel by 10x, then there will be a lot more applications being built. Same with planes, right? Like 50 years ago, only the rich could fly on planes. Now,

00:15:31:23 - 00:15:54:18
Alex Yeh
Like, literally anyone can fly on a plane and go somewhere else. It's like a hundred bucks, you know, 200 bucks, and you can get on a plane and go places. So I'm seeing the same thing. Literally, like, Claude, I think last I checked it's like 32 bucks per million tokens, and a DeepSeek model, which is on par, if not slightly better.

00:15:54:20 - 00:16:17:15
Alex Yeh
It's $2, right? Some of them are even less than that. So that's like 15x cheaper. And so when you have that, you're just like, oh wow, I can build this now. It makes economic sense. And so there's just so many cool apps and different agents or applications being built right in front of our eyes.
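
The price gap described here can be worked through with a short calculation. This is a minimal sketch that takes the two per-million-token prices exactly as quoted in the conversation ($32 and $2), which are the speaker's rough figures rather than verified list prices. The function names and the 500-million-token monthly workload are hypothetical, chosen only for illustration.

```python
# Per-million-token prices as quoted loosely in the conversation (USD).
# These are the speaker's rough figures, not verified list prices.
QUOTED_PRICE_PER_MILLION = {
    "closed_model": 32.0,
    "open_model": 2.0,
}


def token_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of processing `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def price_ratio(expensive: float, cheap: float) -> float:
    """Return how many times more expensive one price is than the other."""
    return expensive / cheap


if __name__ == "__main__":
    monthly_tokens = 500_000_000  # hypothetical workload, for illustration only
    for name, price in QUOTED_PRICE_PER_MILLION.items():
        print(f"{name}: ${token_cost(monthly_tokens, price):,.2f} per month")
    ratio = price_ratio(
        QUOTED_PRICE_PER_MILLION["closed_model"],
        QUOTED_PRICE_PER_MILLION["open_model"],
    )
    print(f"price ratio: {ratio:.0f}x")
```

At the quoted prices, the hypothetical workload costs $16,000 a month on the more expensive model and $1,000 on the cheaper one, a 16x gap, which is close to the rough figure of 15 given in the conversation.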

00:16:17:17 - 00:16:43:10
Alex Yeh
And so I would categorize AI into three categories, right? One is training, inferencing, and fine-tuning. Sorry, let me put it in the right order: training, fine-tuning, and inferencing. So it's this loop, right? A model is like a kid, right? So training is like you put them in school, the K-12 education.

00:16:43:15 - 00:17:01:16
Alex Yeh
Right. They go through that course, and then they go into production. Production means, like, inferencing. So you throw them out into the real world, right? You're 25. Yeah. Get out there. Go survive. Right. Get out there. Exactly. And then they will go through some struggles, and they have to update their knowledge base.

00:17:01:19 - 00:17:16:05
Alex Yeh
Right. And so they go into that fine-tuning phase, like, okay, I got more information. Let me kind of self-reflect on how to be better in this particular domain, right? So that's called fine-tuning.

00:17:16:07 - 00:17:23:04
Chris
I like that because, you know, when I think of the kids, you know, they're yeah, by the time they get to high school, they know absolutely everything.

00:17:23:06 - 00:17:25:09
Sandesh
That they need to.

00:17:25:09 - 00:17:28:10
Chris
Do. Just need a little fine tuning.

00:17:28:12 - 00:17:31:09
Alex Yeh
Yeah, exactly. To rough out a little bit of edges, you know?

00:17:31:12 - 00:17:34:09
Chris
Yeah, a little rough around the edges.

00:17:34:09 - 00:18:05:15
Alex Yeh
Yeah, yeah, yeah, exactly. So right now, literally, I see my customers are all doing fine-tuning and inferencing, with the exception of particular domains which are still doing training, like foundational gaming engines. In these niche domains, people are still doing training, right? And then all the big boys are still doing training, right?

00:18:05:15 - 00:18:25:05
Alex Yeh
They just want to keep getting better, because the Chinese and the Americans and all the Magnificent Sevens are competing with each other. So they will just continue to try new innovations on software. The models will just get bigger and bigger and bigger. They'll still just get better, and then people will continue to open source that, right?

00:18:25:05 - 00:18:42:10
Alex Yeh
So it's closed source versus open source, and they'll continue to compete, which is really amazing for everyone else, who are just like, all right, great, we got a better model now. And so you don't have to think about the cost and the performance, because you just have to have an assumption that it will get better, because it will.

00:18:42:11 - 00:18:44:04
Alex Yeh
It has been at it. Well.

00:18:44:06 - 00:19:11:14
Chris
Yeah. Well, I think, you know, there's another aspect of this too that I think is interesting. When you look at Nvidia and some of the prescience that Jensen had, I think, around building GPUs, it was that he, the whole company, spent a lot of time on developing the software around GPUs, you know, the tools that you need to make the GPUs go.

00:19:11:14 - 00:19:38:11
Chris
And I think that's kind of where you're at here, too, right, with the cluster engine. You're building this hypervisor and some of these inferencing tools and things like that; you're taking the hardware and kind of giving people the tools to use the hardware. That's right. So I mean, do you envision that as being sort of the core of your business going forward, building more tools and just kind of further abstracting that piece?

00:19:38:11 - 00:19:38:17
Chris
Yeah.

00:19:38:23 - 00:19:57:13
Alex Yeh
Okay. Yeah. So here's how I think of the world. Model shops will continue to compete. I think this is like a ground truth, okay? As they compete, tokens are going to be cheaper and models are going to be better. That's one ground truth. Another ground truth is that all the hardware companies will continue to compete.

00:19:57:15 - 00:20:21:22
Alex Yeh
And Nvidia will continue to compete with itself, and they come up with, like, 2 or 3x better GPUs every single year. And AMD and other kinds of specialty chips are going to compete. So they're competing at the hardware level. Model shops are competing at the software level. And what we are doing is competing at the system level.

00:20:21:23 - 00:20:45:16
Alex Yeh
So we put these two together, right? That's what the inference engine is about, right? We take the models and put them on our GPUs. So we integrate at the system level. Sometimes we go below the CUDA level, so very deep in the infrastructure. We make sure that per GPU we can squeeze out the maximum amount of tokens for users, right? We're optimizing at the system level.

00:20:45:16 - 00:21:13:20
Alex Yeh
So that's what we're doing. And I would say the larger thing is that they will continue to compete, and we will continue to make sure that on a system level everything gets better, right? And then we are building and abstracting out different difficulties, right? Provisioning the GPUs, or providing more ML frameworks and tools for people to better fine-tune.

00:21:13:21 - 00:21:34:21
Alex Yeh
Right. Remember, once you go into production, you have to update your knowledge base. So it goes back to the GPUs and then you do some minor, small training. It's called fine-tuning, right? Small training. And so that cycle continues. When you have new data coming in, you still have to ingest it.

00:21:34:21 - 00:21:57:12
Alex Yeh
Right. You have to ETL it, clean it, and label it. We're not even providing that, but we will integrate other services, great companies like Scale and so forth. We want to integrate all of that to build our ecosystem, so that people can easily fine-tune their models, their products, on our platform and then push them out to production again, inferencing.

00:21:57:14 - 00:22:13:17
Alex Yeh
And so, yeah, we're optimizing on a system level, and then on a smaller subset of that, we will provide better tools, better connections. Inferencing, same thing, right? Lower latency, more global coverage, etc.

00:22:13:19 - 00:22:38:03
Chris
As your product matures, I would imagine you're probably going to have multiple tiers of service that you're going to be able to deliver to people. Do you envision, in the future, providing just a general SLA-based approach to this kind of consumption model?

00:22:38:05 - 00:22:56:14
Alex Yeh
Yeah, yeah we do. So we offer ten minutes, one hour and two hours. And so we will continue to squeeze that time. Right. So I'll explain what that means. Ten minutes is within ten minutes we will have, our engineering team reach out to you if you have any questions. Right. And then one hour means full system diagnosis.

00:22:56:14 - 00:23:29:06
Alex Yeh
Right? We will identify and tell you, okay, within an hour, here's the problem, right? And within two hours, you should expect full system recovery, right? So these are the general SLAs that blanket all our customers. That's a requirement on our end. And then there will be kind of minor different product lines that people need. The big AI lab shops, they know everything and have a lot of resources. They have platform engineers, they also have application engineers.

00:23:29:11 - 00:23:53:11
Alex Yeh
They have model engineers and researchers, right? So we give them kind of the full control: access to open different ports, and, you know, firewall control, router control, all of that, right? And then some other suites are maybe for student groups, or, like, early-stage founders that want to build stuff.

00:23:53:11 - 00:24:18:19
Alex Yeh
They just want the data. And we can also offer that. We abstract out a lot of the infrastructure resources so that they can have the best experience without being scared of, like, well, there's a lot of things I could do, but I don't know what to do. So we are starting to bifurcate into different product suites, but they are all blanketed by the 10-minute, 1-hour, and 2-hour SLAs.
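
The three SLA tiers described in this answer (engineer contact within ten minutes, full diagnosis within one hour, full recovery within two hours) can be written down as a small data structure. This is a minimal sketch, not GMI Cloud's tooling: the names `SlaTargets`, `sla_deadlines`, and `breached_stages` are hypothetical, and it assumes all three clocks start when the issue is opened, which the conversation does not state explicitly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SlaTargets:
    """Time allowed for each stage, measured from when the incident is opened (assumed)."""
    first_response: timedelta = timedelta(minutes=10)  # engineer reaches out
    diagnosis: timedelta = timedelta(hours=1)          # full system diagnosis
    recovery: timedelta = timedelta(hours=2)           # full system recovery


def sla_deadlines(opened_at: datetime, targets: SlaTargets = SlaTargets()) -> dict[str, datetime]:
    """Return the wall-clock deadline for each SLA stage."""
    return {
        "first_response": opened_at + targets.first_response,
        "diagnosis": opened_at + targets.diagnosis,
        "recovery": opened_at + targets.recovery,
    }


def breached_stages(
    opened_at: datetime,
    now: datetime,
    completed: set[str],
    targets: SlaTargets = SlaTargets(),
) -> list[str]:
    """Return stages whose deadline has passed without the stage being completed."""
    deadlines = sla_deadlines(opened_at, targets)
    return [
        stage
        for stage, due in deadlines.items()
        if now > due and stage not in completed
    ]


if __name__ == "__main__":
    opened = datetime(2025, 7, 15, 9, 0)
    now = datetime(2025, 7, 15, 10, 30)
    # First response happened, but diagnosis is overdue 90 minutes in.
    print(breached_stages(opened, now, completed={"first_response"}))
```

In the example at the bottom, an incident opened at 09:00 with only the first response completed by 10:30 reports the diagnosis stage as breached, while recovery is not yet due.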

00:24:18:21 - 00:24:43:11
Chris
Since you're abstracting so much, I mean, it sounds like you can provide everything from almost bare metal to, you know, further up the stack, and give tweakability along that in different capacities. Are you doing a lot of virtual GPU stuff as well, you know, where you're slicing them up into even finer increments?

00:24:43:13 - 00:25:08:17
Alex Yeh
That's right. So we offer two major types of service. One is bare metal, at different levels. If you want full access, we give you full access. We can also offer you containers, right? So you can have a bank of GPUs, sorry, of compute resources, and then you can open and close different containers.

00:25:08:17 - 00:25:33:07
Alex Yeh
Maybe you want to do fine-tuning here, training here, and inferencing there. You can open and close them, right? So those are the different things that we offer. We slice GPUs, and it's on our roadmap to integrate with Run:ai, which is a company that Nvidia acquired and open sourced; you know, they're very good at partitioning the memory of GPUs.

00:25:33:07 - 00:25:47:00
Alex Yeh
So that will be on our roadmap to kind of do that. So yeah, we will continue to integrate and innovate on a system level to make these things happen.

00:25:47:02 - 00:26:18:20
Chris
So what do the economics of building these facilities look like? Because, I mean, you know, you got out of China because he didn't like the way the power consumption was going on Bitcoin, right? And so now, you know, everybody takes these to different places. I mean, we literally are having conversations about renewing the nuclear power industry at a very rapid pace to, you know, sort of supply power here.

00:26:18:22 - 00:26:24:12
Chris
You know, how do you figure out where to put your data centers, where to get that power?

00:26:24:12 - 00:26:52:10
Alex Yeh
So yeah, it's a very interesting question. I like talking about power, because it actually is a bottleneck right now for America and for a large part of the world. So, you know, we don't operate in China, so we don't know what's happening there. I heard that they're very, very fast. But in the places where we operate, we're in Japan, in Taiwan, and Thailand and Singapore, and in the US.

00:26:52:12 - 00:27:16:21
Alex Yeh
Right. And having built mining, I kind of understand how the US operates. So, in short, to build a data center you need three things: power, fiber (that is, internet), and water. These three things are each very, very crucial. And people overemphasize power.

00:27:16:23 - 00:27:43:01
Alex Yeh
But fiber is also important in terms of the lead time. So the US, surprisingly, doesn't lack power generation, because the US is, I think, 70% natural gas power. The cool thing about gas is you literally order a turbine and you can pump that gas in, and boom, there's a 300MW increase overnight, right?

00:27:43:03 - 00:28:10:16
Alex Yeh
Nuclear is a bit more difficult, but technically you can generate power very, very fast. But you need these three things together, right? And these power facilities are typically in very remote places. So you have to pull fiber, right? This is not like your phone, where you can survive with 100MB per second. These data centers require redundancy, with fiber cables pulled from the central hub.

00:28:10:17 - 00:28:35:22
Alex Yeh
So you need, like, at least 100 gigs of dual-line fiber, so just in case one somehow gets snapped, you have redundancy. Also, water is for cooling, right? And so you need to have all three of these together. So what the US lacks is actually distribution of power, not generation of power. And distribution is, again, bottlenecked by regulations.

00:28:36:00 - 00:28:57:01
Alex Yeh
And these power companies are just legacy businesses, so they move slower than the data centers and the AI guys require. The AI guys are like, we want it tomorrow, we want it next month. And the power companies are like, the best I can do is two years, you know? So, all the lead time.

00:28:57:01 - 00:28:59:06
Chris
On transformers can be really long, right?

00:28:59:06 - 00:29:04:23
Alex Yeh
So yes yes, yes, yes. So that's kind of the economics on the power side and data center.

00:29:05:02 - 00:29:10:01
Chris
Yeah. I mean it's like that old saying, you know, the future's already here. It's just unevenly distributed.

00:29:10:03 - 00:29:14:10
Alex Yeh
Yeah. Yes. Yes. Actually yes. Yeah I like that.

00:29:14:12 - 00:29:32:05
Sandesh
So Alex, obviously you're in a huge market. But I have to think your ideal customer base can't be huge. So who is your ideal customer?

00:29:32:07 - 00:29:56:07
Alex Yeh
My ideal customers are these AI-native companies that want to scale, right? That want to fine-tune, that want to train, that want to do inference. Those would be my ideal customers, right? They have the knowledge base of what they need to do. They just need someone to provide really good infrastructure. And what we're really good at is system-level optimization.

00:29:56:09 - 00:30:28:19
Alex Yeh
So if I may go into the technical details a little bit. Right now all the models are going into the multimodal world, or mixture of experts, right? Meaning that the latest Llama 4, right, they have 120 experts. What that means is, if you ask them a very detailed question, like how do you compare ten of us trying to complex things like that, kind of complicated detail.

00:30:28:19 - 00:30:48:00
Alex Yeh
So it requires kind of the long-compute-time thinking, right? Same with humans. Like, oh, let me get back to you in an hour, let me do some thinking. So you want to pull up the largest model possible, right? The smartest model possible. But if you ask, like, what is the color of AirPods?

00:30:48:03 - 00:31:12:07
Alex Yeh
Right. It should be quick thinking. You don't want to interact with that long-term big model, right? You want to interact with as small a model as possible, which is quick and gives you an answer. That's called mixture of experts. Right. You have different experts. And so right now the current thing is you're putting all of these capabilities in one model, on one GPU, in one server.

00:31:12:09 - 00:31:39:04
Alex Yeh
And so you're running the process physically on the same hardware. So that makes things very, very slow. And so you would probably think, like, okay, we should put it on different GPUs. Right. And so you essentially kind of delegate out all the difficult tasks to different people. But for this deep research, you probably will. Same with people, right?

00:31:39:09 - 00:31:56:08
Alex Yeh
If you ask this deep research problem, you want to ask probably, like, a poli-sci PhD at Stanford or whatever, right, to give you an answer. Now if you ask, like, what is this color, you literally grab anyone on the street, like, well, what color is this? So you want to delegate these tasks out. Same thing is what we're doing at the system level.

00:31:56:08 - 00:32:28:01
Alex Yeh
We delegate tasks out to different GPUs and optimize that, right. So I think that's one of the things that we do. On the compute side, like during transit, that is between compute, and also at storage, which is kind of, when things are saved, we put it in different locations so that it's distributed for optimized performance.
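
Editor's note: the delegation idea Alex describes can be illustrated with a toy mixture-of-experts router. This is a minimal sketch under stated assumptions, not GMI Cloud's implementation: the gate scores are made up, `route_token` uses a generic softmax top-k gate, and `place_experts` is a hypothetical round-robin placement of experts onto GPUs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_scores, top_k=2):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = ranked[:top_k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def place_experts(num_experts, num_gpus):
    """Hypothetical placement: spread experts across GPUs round-robin."""
    return {expert: expert % num_gpus for expert in range(num_experts)}


# Illustrative numbers only: 8 experts spread over 4 GPUs.
placement = place_experts(num_experts=8, num_gpus=4)
chosen = route_token([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], top_k=2)
gpus_used = sorted({placement[expert] for expert, _ in chosen})
print(chosen, gpus_used)
```

Only the selected experts do work for a given token, so spreading experts over several GPUs lets different tokens be served by different hardware instead of one server holding everything.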

00:32:28:03 - 00:32:48:09
Chris
I think in the early days of LLMs, people were sort of surprised by the fact that statistical models of the English language couldn't do math, for example, you know, and that these simple things needed to be added in around the edges to sort of complete that picture. Right?

00:32:48:11 - 00:32:55:01
Chris
So we have come a long way in a very short amount of time, which is really, really interesting.

00:32:55:06 - 00:33:22:03
Alex Yeh
So right now if you just want an API, literally go on the website and you can choose your own poison, choose a model, and you can start playing with it, like, right now. You don't need to talk to anyone. Another way would be talking to sales. If you're a startup and you're scaling fast, and you need more bespoke services, then we can also offer that, and give you the GPU resources.

00:33:22:04 - 00:33:44:06
Alex Yeh
I'll give you an example, like we have a customer, and because we have data centers globally, we're able to scale quite broadly, because everyone is building apps. Right? You don't know where customers are coming from. You kind of know, like, oh, some people are from the East Coast, West Coast, as well as somewhere in Asia, maybe in Japan or Taiwan or wherever.

00:33:44:08 - 00:34:09:18
Alex Yeh
And it's like telecom, you need coverage. Right? So it's the same thing, we offer these types of things. So one example is we have a customer that in the first month scaled 8x, and in the second month 11x. And we were able to perfectly match the customer growth curve and give them the right amount of resources.

00:34:09:20 - 00:34:36:10
Alex Yeh
Right. So this is the first phase. I think in the next couple of months we're going to roll out auto scaling. So it will be able to detect automatically and scale accordingly for the customers, and then also provide the lowest ping possible. Right. So if the customer is coming from Japan, then we will be able to ping the Japan data center and provide resources for them.
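
Editor's note: the two ideas in this answer, routing to the nearest region and scaling with demand, can be sketched in a few lines. This is an illustration under assumptions only: the region names, latency figures, and per-replica capacity are hypothetical, and the actual auto scaling feature had not shipped at the time of recording.

```python
import math


def pick_region(latency_ms):
    """Return the region with the lowest measured round-trip latency."""
    if not latency_ms:
        raise ValueError("no regions measured")
    return min(latency_ms, key=latency_ms.get)


def replicas_needed(requests_per_sec, capacity_per_replica):
    """Smallest replica count that covers the current load (at least one)."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    return max(1, math.ceil(requests_per_sec / capacity_per_replica))


# Hypothetical measurements for a user located in Japan.
measured = {"us-east": 160.0, "us-west": 110.0, "japan": 12.0, "taiwan": 45.0}
best = pick_region(measured)

# Hypothetical load: traffic grew 8x from a 110 requests/sec baseline.
replicas = replicas_needed(requests_per_sec=110 * 8, capacity_per_replica=100)
print(best, replicas)
```

A production system would also smooth the load signal and add cooldowns so it does not scale up and down on every spike.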

00:34:36:12 - 00:34:57:00
Sandesh
So from a competitive perspective, how do you guys win? Like what is the you know, couple things that you can kind of point out that you feel you have an edge on your competition and you know and feel comfortable to, you know, speak as freely as you'd like, you know, because I think this is a really interesting conversation, to hear about.

00:34:57:00 - 00:34:59:06
Sandesh
How do you guys differentiate?

00:34:59:08 - 00:35:08:10
Chris
And let me add one more thing to that. Like, who do you view as your competition? Is it the AWSs of the world or is it, you know, other neo clouds?

00:35:08:10 - 00:35:36:07
Alex Yeh
I think it would be the other neo clouds that would be my competitors. So I would say two things. We're the only US company, and I even fact-checked this with Nvidia, like, is anyone operating in Asia? Not that I know. So we're the only neo cloud that has robust data centers, directly operating these data centers, directly in Asia.

00:35:36:08 - 00:35:57:09
Alex Yeh
Right. So if you're building a consumer app, right, and you have customers globally, we will be able to service you. We're probably the only US shop that can help you with it. Right. Again, right, it's literally from East Asia to Southeast Asia and US East and US West. And so we basically have the entire coverage, and we're expanding to Europe.

00:35:57:09 - 00:36:28:12
Alex Yeh
So that's one thing. So global coverage. Second thing is the whole system integration. So what I talked about, kind of partitioning and segregating different experts as well as compute, is a very novel thing that's just happening right now. And so to my knowledge, none of the shops, neo clouds alike, have these capabilities. Everyone's just shoving them in one server at this moment.

00:36:28:14 - 00:36:54:01
Alex Yeh
And so you basically get more bang for your buck. Like, you're renting the same GPUs, but my output is 70% more than other shops. And so even though our price point is slightly more expensive, like 20, 30%, right, your total output is just much more. Like, everyone's GPUs are pushing 10 million tokens per minute or something like that.

00:36:54:03 - 00:37:24:08
Alex Yeh
And I'm pushing like 17 million. Right. And so you're saving a lot more money because we do very deep hardware and software integration. The cool thing is we were able to replicate what DeepSeek was able to do. They basically, the reason why they broke the need today is because they went below CUDA and were able to control partitioning of their models.
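
Editor's note: the trade-off Alex describes (a 20 to 30% higher price for roughly 70% more output) can be checked with simple arithmetic. The sketch below uses only the round numbers quoted in the conversation, 10 million versus 17 million tokens per minute; they are the speaker's illustrative figures, not benchmark results, and `relative_cost_per_token` is a hypothetical helper.

```python
def relative_cost_per_token(price_ratio, throughput_ratio):
    """Cost per token relative to a baseline provider.

    price_ratio: our price divided by the baseline price (1.2 means 20% more).
    throughput_ratio: our tokens per minute divided by the baseline's.
    """
    if throughput_ratio <= 0:
        raise ValueError("throughput_ratio must be positive")
    return price_ratio / throughput_ratio


# Figures quoted in the episode: 17M vs 10M tokens per minute.
throughput_ratio = 17 / 10
low = relative_cost_per_token(1.20, throughput_ratio)   # 20% price premium
high = relative_cost_per_token(1.30, throughput_ratio)  # 30% price premium
print(f"relative cost per token: {low:.2f} to {high:.2f}")
```

On those quoted numbers, each token costs roughly 24 to 29% less than on the baseline despite the higher hourly price.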

00:37:24:10 - 00:37:51:13
Alex Yeh
I think there was some news that came out a while back that was like, their cost is like 10x lower than anyone, but they have 500% profit. Now, how did they do that? Right. It's because they do very low-level integration. And I think this is the biggest difference between the AI-native cloud and traditional hyperscaler clouds. Hyperscaler clouds are basically abstracting resources, right?

00:37:51:13 - 00:38:20:06
Alex Yeh
Their traditional way of doing things, abstracting resources, is basically all the CPU, GPU, storage, networking. These four things are abstracted out, and you pull applications. All the hyperscalers are building the exact same way. And the AI-native clouds, at least we are, are building a truly vertical cloud, right? We integrate the software and the hardware to make sure that it functions as a whole.

00:38:20:06 - 00:38:46:16
Alex Yeh
That's why, you know, iOS is just better than, like, Windows in terms of user interaction, speed and responsiveness, right? It's because Apple owns the hardware. Right. They design the hardware, and that makes things more efficient. So we're taking the same method, the hardware and software integration. And so I would say these two things will be our advantage.

00:38:46:16 - 00:38:53:09
Alex Yeh
One is our global reach. And the second is our tech capabilities and flexibility and speed.

00:38:53:11 - 00:39:15:23
Chris
And I've got to think that, you know, the pace of innovation in the world of AI has been so incredibly fast that there's been a tremendous amount of optimization that's just not happened or been missed. And, like, the fact that you're focusing on doing that optimization, I've got to imagine there's just a lot of gold there, right?

00:39:16:01 - 00:39:39:14
Alex Yeh
Yeah. I think, future GMI again, with what the name name, means. Right. It's general machine intelligence. We want to build a future with people, right? We're an enabler. We want to be your painkiller, for all the accidental. So you have, so the founders and enterprise can just focus on building really amazing applications.

00:39:39:16 - 00:40:00:00
Alex Yeh
I think the most exciting thing about building GMI is you see so many amazing startups and it's informed, I think, and that really excites me. So we want to be kind of the Robin of the AI world, where our customers can be the Batman, building these shiny applications in the front end and in public.

00:40:00:00 - 00:40:27:23
Alex Yeh
Right. So I think that's a really, really cool thing. And so we just want to focus on being an infrastructure provider, making sure that your apps or agents run smoothly, making sure that your models are trained as you want them, and providing the best and easiest access to resources, if that's what our customer needs.

00:40:28:01 - 00:40:38:12
Alex Yeh
So that's kind of how I see, going forward. And we'll continue to optimize on the system level to ensure that our customers have the best experience.

00:40:38:14 - 00:40:45:15
Sandesh
And where can customers find you? GMI Cloud dot AI, or what's the best?

00:40:45:17 - 00:40:53:00
Alex Yeh
That's right, GMI Cloud dot AI. And if they want to find me, I'm on LinkedIn, and people can reach out to me.

00:40:53:02 - 00:41:01:04
Chris
Awesome. It's exciting to see what you're building. Congratulations on your success to date. And, can't wait to see where you go from here.

00:41:01:06 - 00:41:02:13
Alex Yeh
Absolutely.

00:41:02:15 - 00:41:05:13
Sandesh
Hey, Chris. How hard could it be? Yeah, how hard?

00:41:05:15 - 00:41:09:07
Alex Yeh
How hard could it be? You know.

00:41:09:09 - 00:41:11:04
Chris
Well, I think you're finding out how hard it can be.

00:41:11:04 - 00:41:15:15
Alex Yeh
Yeah, pretty difficult every day. But I go to places. I think persistence is number one.

00:41:15:21 - 00:41:21:05
Chris
But at least you're doing it so that you're doing the hard stuff so that other people don't have to. And that's the important part, right?

00:41:21:08 - 00:41:32:11
Alex Yeh
Yeah. Hey, you know, I, I bought land from my old, you know, cowboy. Yeah. In Texas. In Arkansas. Yeah. Yeah. After that, I think was everything. You know, they want more smooth sailing.

00:41:32:11 - 00:41:47:08
Sandesh
So awesome. Well, we're definitely going to keep watching you guys. And we really appreciate the partnership here and collaboration. You are awesome to talk to. And I hope you come back to the show.

00:41:47:10 - 00:42:11:04
Alex Yeh
Absolutely. And feel free to reach out, and we can grab a coffee and talk about, like, different cool applications because, again, I'm the only US company that has real insights into how Asia is developing AI. Right? I listen to the All-In Pod and things like that. They just, like, talk about China and talk about Asia.

00:42:11:04 - 00:42:40:11
Alex Yeh
I'm like, I really want to go on a show and talk about, kind of, what is really happening. And I just have a lot of things I want to talk about, especially coming from a Taiwanese guy sitting squeezed between both worlds. Right. And then seeing how Japan's developing their own stuff, how Singapore is developing and how the U.S. is developing, and seeing everyone's view of the world colliding, and different technology, different trends.

00:42:40:11 - 00:43:06:02
Alex Yeh
And what's hot because, you know, I can I can weed out all the talking about like how I'm doing far fast, but I see it on my TV. Is I going fast or just racing without doing much right? And so there's a bunch of different pockets, different markets. And so we can talk about this like I'm, I'm pretty sure two hours about, yeah, cool stuff or power or.

00:43:06:04 - 00:43:13:15
Sandesh
Yeah. On the next podcast. Hey, I think that's the cool thing, right? AI is the gift that keeps on giving and we are just getting started there.

00:43:13:15 - 00:43:24:09
Alex Yeh
We just started just started. This is 2010 like everyone is building app. We haven't seen these like major killer apps yet. I do still another kind of revenue.

00:43:24:13 - 00:43:30:15
Chris
Yeah, but, like, two years is like 100 years in AI years, I feel like.

00:43:30:17 - 00:43:56:02
Alex Yeh
Yes, AI works in seven-day increments. Within every seven days there will be some shop open sourcing some new things, or OpenAI comes up with new models. New applications, new, yeah, new products. Found product-market fit, scaled to 100 million. Every seven days there will be major news coming out from the shops.

00:43:56:03 - 00:43:58:00
Alex Yeh
In around.

00:43:58:02 - 00:44:06:14
Chris
I think we've just replaced Moore's Law with Yeh's law of seven days.

00:44:06:16 - 00:44:10:22
Sandesh
Yeah. You heard it here. I heard there's.

00:44:11:00 - 00:44:13:19
Chris
Every seven days.

00:44:13:21 - 00:44:15:09
Alex Yeh
Okay.

00:44:15:11 - 00:44:16:18
Chris
Thank you so much for being on.

00:44:16:18 - 00:44:22:21
Sandesh
Really appreciate it, Alex. Yeah, absolutely. Thank you so much. And we look forward to the next one. Take care.

00:44:22:22 - 00:44:25:05
Alex Yeh
Thank you, Chris. Thank you, Sandesh.

00:44:25:07 - 00:44:49:15
Sandesh
Well, that's the show. But before you roll, sales pitch warning: we're building more than a brand here. We're building a community. So your support means everything to us. So please like, comment, subscribe, drive, follow and also reach out to us directly. We want to hear from you. Thank you so much for your support. Until next time.